06-01-2010, 05:22 PM | #1 |
Enthusiast
Posts: 49
Karma: 14
Join Date: Apr 2010
Device: iPad & iPhone
|
ePub Output Bug, Caused by MSWord
There is an annoying bug in the Calibre ePub conversion module, linked to a "feature" of MSWord.
This original text:
Code:
to Unseelie Court on King Street and tease
Code:
to Unseelie Court on King Street and tease
MS Word-generated HTML/XHTML includes "smart tags." When such an HTML file is converted to ePub, these tags are translated, but errant <p> tags are inserted into the new HTML.
Original HTML code:
Code:
to <st1:Street w:st="on"><st1:address w:st="on">Unseelie Court</st1:address></st1:Street> on <st1:Street w:st="on"><st1:address w:st="on">King Street</st1:address></st1:Street> and tease
Code:
to</p> <address class="calibre8"><span>Unseelie</span> Court</address> <p>on</p> <address class="calibre8">King Street</address> <p>and tease
Either erase the MSWord smart tags before converting, or fix the <p> tags by hand after converting (unzip ePub, edit .html or .xhtml files, rezip). This has been reported as ticket #5671 in the Calibre Bug Tracking system. |
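For the first workaround (erasing the smart tags before converting), something like the following might do. It is only a minimal sketch: it assumes the smart tags always use the `st1:` prefix seen in the example above, and the function name `strip_smart_tags` is made up for illustration. It removes the wrapper tags and leaves the text inside them alone.

```python
import re

# Matches opening or closing tags with the "st1:" prefix, e.g.
# <st1:Street w:st="on"> or </st1:address>.
# Assumption: the prefix is always "st1", as in the example above.
SMART_TAG = re.compile(r"</?st1:[A-Za-z0-9_]+\b[^>]*>")


def strip_smart_tags(html: str) -> str:
    """Remove smart-tag wrappers but keep the text they enclose."""
    return SMART_TAG.sub("", html)


if __name__ == "__main__":
    sample = (
        'to <st1:Street w:st="on"><st1:address w:st="on">Unseelie Court'
        '</st1:address></st1:Street> on <st1:Street w:st="on">'
        '<st1:address w:st="on">King Street</st1:address></st1:Street> and tease'
    )
    # Prints: to Unseelie Court on King Street and tease
    print(strip_smart_tags(sample))
```

A regex is good enough here because the wrappers carry no content of their own. For anything messier, a real HTML parser would be the safer choice.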
06-01-2010, 06:05 PM | #2 | |
Well trained by Cats
Posts: 30,352
Karma: 58032210
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
I noticed that street names seem to get broken up instead of just Italicized. Figured Kovid liked it that way |
06-01-2010, 10:41 PM | #3 |
Grand Sorcerer
Posts: 6,215
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Hi Daddy Warpigs,
If you generate your HTML using MSWord, you should use the SaveAs Webpage-Filtered option rather than SaveAs Webpage. The "smart tags" should then not be created in your generated HTML and there is no need for manual editing. |
06-02-2010, 09:03 AM | #4 |
Wizard
Posts: 1,763
Karma: 30063305
Join Date: Dec 2006
Location: Singapore
Device: Boyue
|
I would also recommend passing the HTML file through HTML Tidy.
That cleans up much of the crap Word adds to the file. I have seen files go down from 1 MB to about 500 KB sometimes. |
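A rough sketch of running HTML Tidy from a script, assuming the `tidy` command-line tool is installed and on the PATH. The helper names `build_tidy_command` and `run_tidy` are hypothetical, and the option set (`-q`, `-asxhtml`, `--word-2000 yes`, `-o`) is just one plausible choice for Word output, not a recommended configuration.

```python
import subprocess
from pathlib import Path


def build_tidy_command(src, dst):
    """Build a tidy command line for cleaning Word-generated HTML."""
    return [
        "tidy",
        "-q",                   # suppress nonessential output
        "-asxhtml",             # write the result as XHTML
        "--word-2000", "yes",   # ask tidy to strip Word-specific markup
        "-o", str(dst),         # output file
        str(src),               # input file
    ]


def run_tidy(src, dst):
    """Run tidy and return (size_before, size_after) in bytes."""
    result = subprocess.run(
        build_tidy_command(src, dst), capture_output=True, text=True
    )
    # tidy exits with 0 on success, 1 on warnings, 2 on errors.
    if result.returncode >= 2:
        raise RuntimeError(f"tidy reported errors:\n{result.stderr}")
    return Path(src).stat().st_size, Path(dst).stat().st_size
```

Calling `run_tidy("book.html", "book_clean.html")` (file names are placeholders) returns the sizes before and after, which makes it easy to see how much was stripped.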