05-03-2015, 10:12 PM | #1 |
Fanatic
Posts: 527
Karma: 1048576
Join Date: May 2009
Device: bebook; prs-950; nook simple touch; HTC Jetstream tablet
|
Problem converting azw to epub
Recently I purchased an azw book from Amazon and converted it to epub with the most recent calibre version. It converted OK, but there were abundant spelling errors throughout the book. It looked like the output of an OCR pass that couldn't distinguish many letters and letter combinations. I had never encountered this with calibre before.
In addition, after the conversion the results were an epub and an htmlz book - usually calibre (2013 version) gives the original azw file rather than an htmlz file. Does anyone have any ideas what happened and how to correct it? Last edited by bobcdy; 05-03-2015 at 10:17 PM. |
05-03-2015, 10:24 PM | #2 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
HTMLZ? I regret to inform you that you have stumbled upon one of the rare Topaz format experiments, and discovered exactly why it is so horrible.
Topaz used, IIRC, some form of embedded image, backed by OCR. The DeDRM plugin, and calibre as well, can only extract the OCR layer. Basically it is the PDF of ebook formats. |
05-04-2015, 12:59 AM | #3 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
You have 7 days to return any e-book to Amazon if you are not happy. They will give a full refund. Just don't make a habit of it, or your account could get suspended for excessive returns.
|
05-04-2015, 06:08 PM | #4 |
Fanatic
Posts: 527
Karma: 1048576
Join Date: May 2009
Device: bebook; prs-950; nook simple touch; HTC Jetstream tablet
|
Thanks to both of you for the information about my kindle book conversion. The Kindle book has almost no errors but there are tons of spelling errors in the epub conversion. I'm in the process of correcting them with Sigil spellcheck.
|
05-04-2015, 07:13 PM | #5 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
The Topaz format looks proper when you first read it, because it uses images of the proper text. But then you try DeDRMing and converting, or highlighting, and you hit the invisible OCR layer.
The Kindle book has lots of errors, just not in the image overlay. |
05-05-2015, 10:43 AM | #6 |
Bookaholic
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
|
Italics will also be missing so if you want those you'll need to put them in manually.
|
05-05-2015, 12:04 PM | #7 | |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Quote:
I have a number of old Egyptology books in Topaz format that really can't practically be converted to Mobi/ePub format because they contain a lot of non-standard glyphs (Egyptian hieroglyphs, etc). Topaz is an ideal format for such material, because it's a fraction of the size that a page-scan would be, and it's reflowable. Last edited by HarryT; 05-05-2015 at 12:15 PM. |
|
05-05-2015, 02:37 PM | #8 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Hmm, I suppose it *might* have certain niche use. I tend not to read books with those character sets, and generally don't think of them.
One Topaz book that I remember getting was an innocuous copy of The Princess Bride. Some lazy publisher didn't want to bother converting, I guess. Question: Isn't that what the Private Use Areas of Unicode are for, and shouldn't those books use a custom font for that purpose, as part of an otherwise-bog-standard EPUB/AZW3? |
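As a rough illustration of the Private Use Area idea raised above: Unicode reserves U+E000 to U+F8FF, plus planes 15 and 16, for characters whose meaning is defined only by the font in use. The sketch below is not taken from any tool mentioned in this thread; the helper names `is_private_use` and `private_use_chars` are made up for illustration. It only shows how text could be scanned for such code points with Python's standard `unicodedata` module.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch is a Private Use code point.

    Unicode assigns the general category "Co" to private-use characters.
    """
    return unicodedata.category(ch) == "Co"


def private_use_chars(text: str) -> list[str]:
    """List the private-use code points found in text, as U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in text if is_private_use(c)]


if __name__ == "__main__":
    # Hypothetical sample: ordinary letters around one PUA code point that a
    # custom font could map to a non-standard glyph.
    sample = "a\ue001b"
    print(private_use_chars(sample))
```

A book built this way would still need to embed the custom font, since a private-use code point has no standard glyph of its own.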
05-05-2015, 02:51 PM | #9 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Sure, but these are 100+ year-old books that have been out of print for decades, and there's just not the market that would make it economic to create ePub versions of them at a sensible price. I'm happy to have them available, and Topaz works well if you don't have a requirement to convert them.
|
05-08-2015, 12:03 PM | #10 |
Fanatic
Posts: 527
Karma: 1048576
Join Date: May 2009
Device: bebook; prs-950; nook simple touch; HTC Jetstream tablet
|
I first concluded I could correct the spelling mistakes, but found that there were no italics and, even worse, no paragraphs: each chapter was one paragraph. I could have inserted them with Sigil (I have the paperback version, but the small print of the 600-page book is getting too difficult for my eyes for extended reading), but it was too much work, so I gave up on correcting the conversion and finally read the Topaz version on my largest Android tablet. Larger text is possible in the newer Kindle for Android, although the margins are comparatively wide at larger text sizes. In the Topaz version the letters were often poorly formed, especially the H, in which the first upright looked like an I next to the crossbar and right upright together. I won't send the Topaz book back because I read it, but I probably will if I ever buy another one.
Last edited by bobcdy; 05-08-2015 at 12:14 PM. |
|