03-13-2013, 05:32 AM | #511 |
The Grand Mouse 高貴的老鼠
Posts: 72,251
Karma: 309000000
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
I shall have to leave detailed discussion of DATP sections to KevinH and DiapDealer.
Nice to see you back at MobileRead. Thanks again for the original code that's been developed into KindleUnpack. |
03-13-2013, 06:51 PM | #512 | |
Bookmaker & Cat Slave
Posts: 11,482
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Most of this is beyond me, but if I may, you can't actually "generate" a K8 file from Amazon's PDS. If you email a file to your own Kindle addy, or use the PDS in any other way, what you get back is not a K8; it's the old mobi (prc) format. So, you can't compare apples-to-apples (in any sense) for an actual K8 created by KindleGen/KP versus the "mobi" (prc) file that you'll get from PDS.
You can test this yourself by sending a K8 with, say, an embedded font to the PDS--what you get back will be equivalent to a book made with MBPC. Then sideload the same K8 file with an embedded font to a Fire device directly, either by USB or via an actual (not faux) wifi connection. The "send to Kindle by wifi" prompt you can see on your computer does not use wifi; it emails the document/book via the PDS. So when I say "wifi," I mean an app like "Wifi File Explorer," which is genuine wifi. You'll see the difference: the USB or wifi-d book will have the embedded font; the PDS book will not.
Hope that helps. The DATP stuff is too deep for yours truly, but I thought before you tried to sort this, you should use files that are equivalent.
Hitch |
|
03-13-2013, 07:06 PM | #513 | |
Enthusiast
Posts: 42
Karma: 11050
Join Date: Nov 2009
Device: Kindle Paperwhite, Kindle Touch, Kindle 2
|
Quote:
I note that the file size listed on the Amazon site indicates that it stored a combo file in the cloud. I haven't tested downloading to an older device without KF8, but presumably it gets stripped to KF7. |
|
03-13-2013, 07:16 PM | #514 |
Enthusiast
Posts: 42
Karma: 11050
Join Date: Nov 2009
Device: Kindle Paperwhite, Kindle Touch, Kindle 2
|
I just tested that, and it works as I expected. In this case, the only difference from the KindleUnpacked version is in the metadata; all the data sections are identical to the Amazon stripped version.
|
03-13-2013, 08:08 PM | #515 | |
Bookmaker & Cat Slave
Posts: 11,482
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
ETA: Yup--I just sent a K8-formatted book to my Fire, and now it's working. That's very cool, thank you for this discussion--I wouldn't have found out for ages, given that I'd stopped using the PDS for this very reason. (That, and it's faster to just wifi it, but, still...this way I can tell my clients to email the files to their devices. It will save me untold brain-damage. COOL!)
Hitch
Last edited by Hitch; 03-13-2013 at 08:17 PM. Reason: He's right, that's GREAT. |
|
03-13-2013, 09:00 PM | #516 |
Enthusiast
Posts: 42
Karma: 11050
Join Date: Nov 2009
Device: Kindle Paperwhite, Kindle Touch, Kindle 2
|
Well, it would hardly be the first time Amazon quietly changed something without bothering to tell anyone.
Transferring over USB is why I wanted to strip the files myself. Using PDS has the advantage that more content can be kept in the cloud and fetched from the device, and it syncs reading location. (The latter two things don't seem to work on my Kindle 2, but content can be pushed from the web site.) |
03-14-2013, 06:35 AM | #517 | |
Bookmaker & Cat Slave
Posts: 11,482
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
I could harangue for days over the horsepucky with the SRL change in/around December, which doesn't show up until after the Publishing Workflow (in other words, after the book is put on sale)...and then only in some books, with no discernible, describable, or documentable criterion for which ones. I've had no fewer than 20 back-and-forth emails with the Mgr of Digital Operations about this one, because it's just WHACK.
Anyway, though: thanks again. I really wouldn't have found out for ages, simply because it's not a method we ever used a lot, and on the rare occasions we did, after the advent of K8, the doc conversion was still old-school.
Hitch |
|
03-16-2013, 10:34 PM | #518 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
I'm occasionally getting a codec error unpacking calibre-generated news downloads:
Code:
...
Write ncx
Find link anchors
Insert data into html
Insert hrefs into html
Remove empty anchors from html
Insert image references into html
Write opf
Error: 'ascii' codec can't decode byte 0xe2 in position 84: ordinal not in range(128)
Error: Unpacking Failed
It would be nice if KindleUnpack would report this type of error (including where the offending byte data is) and then ignore it and carry on instead of terminating. I suppose it's possible there is an error somewhere in calibre, but the resulting files work fine on kindles, ipads, etc., so whatever it is, it's harmless, and anyway there is no way to figure out where the issue might be in calibre without some useful information from KindleUnpack.
Normally I would try to isolate the issue in KindleUnpack myself, but the code has changed and grown so much since I last worked on it that it would be a major project for me to get back into it. Hopefully, someone who is up to speed on the current code can deal with this. |
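The report-and-continue behaviour requested here might look roughly like the sketch below. It is not KindleUnpack code: `decode_or_report` and its `context` parameter are hypothetical names, the sample title is made up, and the sketch is written for Python 3 rather than the Python 2 that KindleUnpack ran on at the time.

```python
def decode_or_report(raw, encoding="ascii", context=16):
    """Decode raw bytes; on failure, report where the bad byte is and carry on."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded.
        lo = max(0, exc.start - context)
        hi = min(len(raw), exc.end + context)
        print("Warning: %s" % exc)
        print("  bytes near offset %d: %r" % (exc.start, raw[lo:hi]))
        # Fall back to a lossy decode so that processing can continue.
        return raw.decode(encoding, errors="replace")


# Hypothetical metadata value containing a non-ascii character.
title = "Jean Chrétien: A capable caretaker".encode("utf-8")
print(decode_or_report(title))
```

The lossy fallback swaps each undecodable byte for U+FFFD, so the output would be slightly damaged but the run would finish and the warning would point at the offending bytes.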
03-16-2013, 11:28 PM | #519 |
Sigil Developer
Posts: 8,109
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi Nick,
That error is typically generated deep inside the python library code when unicode data is passed between threads but somehow the default python encoding is used, and on some platforms this is ascii, which causes an error. I thought all of those were fixed in the very latest version of KindleUnpack. Perhaps not. Or perhaps some full unicode data is used in a filename or book title or link target that should have been properly converted to utf-8 before being written to the opf.
Either way, please post a zip archive of the problem news feed ebook and I will try to track down what is happening and get it fixed.
KevinH
Last edited by KevinH; 03-16-2013 at 11:29 PM. Reason: fix typos |
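For anyone following along, the error message itself is easy to reproduce. The sketch below is Python 3 and only illustrates the failure mode; in Python 2 the same ascii decode can happen implicitly when a byte string is mixed with a unicode string. The helper name `try_ascii_decode` and the sample text are assumptions made for illustration, not anything from KindleUnpack.

```python
def try_ascii_decode(raw):
    """Decode bytes with the ascii codec and return (text, error message)."""
    try:
        return raw.decode("ascii"), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# A left single quotation mark (U+2018) is the three bytes e2 80 98 in utf-8.
sample = "an ‘utter failure’".encode("utf-8")

text, error = try_ascii_decode(sample)
print(error)

# Decoding with the encoding the bytes were actually written in succeeds.
print(sample.decode("utf-8"))
```

The byte 0xe2 named in the reported error is the lead byte of a utf-8 encoded curly quote, which fits a smart quote sitting somewhere in the metadata.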
03-17-2013, 12:12 AM | #520 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Kevin - attached is a file that generates this fault.
|
03-17-2013, 12:41 AM | #521 | |
Enthusiast
Posts: 42
Karma: 11050
Join Date: Nov 2009
Device: Kindle Paperwhite, Kindle Touch, Kindle 2
|
Quote:
The interface Amazon provides for maintaining Personal Documents is horrid and basically unusable for maintaining a library. In addition to laborious paging interfaces on the web site, there's no ability to categorize, no way to update a document if you've changed it, and not even any attempt to suppress duplicates. In addition to that, leaving WiFi on to sync location between the Touch and Paperwhite clearly caused the battery on both to drain significantly faster, which greatly nonplussed me. So, screw that—I'll stick with maintaining my Kindles with rsync. |
|
03-17-2013, 11:55 AM | #522 |
Sigil Developer
Posts: 8,109
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi Nick,
It seems the Description metadata item in your np.mobi testcase is properly utf-8 encoded (and it does correctly encode and use non-ascii characters); notice the smart quotes and accented chars in this snippet. I looked at the Description in a hex editor and all of the smart quotes appear to be utf-8 encoded and not cp1251.
---
Key: "Description" Value: "Daily news from the National Post
Articles in this issue:
Is the war on cancer an ‘utter failure’?: A sobering look at how billions in research money is spent
Jean Chrétien: A capable caretaker, but no statesman
---
The error you reported seems to happen because utf-8 bytes in the Description metadata element are not being handled properly in either the unescape or xmlescape python library routines. In other words, the bug fix we made to escape html in the metadata text fields properly (you can't have html inside the opf xml metadata, dc:description) now goes wrong somewhere inside those libraries when utf-8 text is used. To prove this I made the following change to mobi_opf.py to disable the html escaping.
Code:
--- mobi_opf.py~ 2013-01-12 23:40:42.000000000 -0500
+++ mobi_opf.py 2013-03-17 11:38:06.000000000 -0400
@@ -47,7 +47,8 @@
                 for value in metadata[key]:
                     # Strip all tag attributes for the closing tag.
                     closingTag = tag.split(" ")[0]
-                    data.append('<%s>%s</%s>\n' % (tag, xmlescape(self.h.unescape(value)), closingTag))
+                    # data.append('<%s>%s</%s>\n' % (tag, xmlescape(self.h.unescape(value)), closingTag))
+                    data.append('<%s>%s</%s>\n' % (tag, value, closingTag))
                 del metadata[key]
 
         def handleMetaPairs(data, metadata, key, name):
I am not sure how these library routines work, but somewhere inside they are assuming the string is ascii or converting it through ascii, and this causes the error when the bytestring is in fact utf-8. So I will have to dig around in those libraries to see how to fix their issues with handling properly encoded bytestrings.
The fix may take a while, but in the meantime you can simply disable the unescaping via the patch above.
Thanks for the testcase.
Take care,
KevinH |
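The hex-editor check described in this post (are the smart quotes utf-8, or a single-byte Windows code page?) can also be done in a couple of lines. This is a rough Python 3 sketch, not part of KindleUnpack; `looks_like_utf8` is a hypothetical helper, and cp1252 is used here only as a stand-in for a single-byte Windows code page.

```python
def looks_like_utf8(raw):
    """Return True if the byte string decodes cleanly as utf-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same left single quotation mark in two encodings.
utf8_quote = "‘".encode("utf-8")     # three bytes: e2 80 98
cp1252_quote = "‘".encode("cp1252")  # one byte: 91

for label, raw in (("utf-8", utf8_quote), ("cp1252", cp1252_quote)):
    print(label, raw.hex(), looks_like_utf8(raw))
```

A lone 0x91 byte is not a valid utf-8 sequence, so a Description that decodes cleanly as utf-8 and shows e2 80 98 for its curly quotes is consistent with what the hex editor showed.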
03-17-2013, 12:19 PM | #523 |
Sigil Developer
Posts: 8,109
Karma: 5450184
Join Date: Nov 2009
Device: many
|
update mobi_opf.py
Hi Nick,
Okay, I think xmlescape and HTMLParser both work better with full unicode strings. At that point, all metadata has already been encoded as utf-8, so I have modified mobi_opf.py to convert all required pieces from utf-8 to full unicode, pass them through the xmlescape and unescape methods, and then convert back to the needed utf-8 for the opf file.
So please give this mobi_opf.py version a try and let me know if it fixes your issues.
Thanks,
Kevin |
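The attached file is not reproduced in the thread, so the following is only a guess at the general shape of the round trip described: decode utf-8 to unicode, unescape, xml-escape, then encode back to utf-8. It uses Python 3 standard-library calls (`html.unescape` and `xml.sax.saxutils.escape`) as stand-ins for the Python 2 routines named in the patch; `escape_metadata_value` and the sample value are hypothetical.

```python
from html import unescape
from xml.sax.saxutils import escape as xmlescape


def escape_metadata_value(raw_utf8):
    """Take utf-8 bytes and return utf-8 bytes safe to place inside an XML element."""
    text = raw_utf8.decode("utf-8")  # bytes -> unicode text
    text = unescape(text)            # resolve HTML entities such as &eacute;
    text = xmlescape(text)           # escape &, < and > for XML
    return text.encode("utf-8")      # back to utf-8 bytes for the opf


# Hypothetical Description value mixing an HTML entity, markup and smart quotes.
value = "Jean Chr&eacute;tien & the ‘war on cancer’ <b>".encode("utf-8")
print(escape_metadata_value(value).decode("utf-8"))
```

Doing the escaping on unicode text and encoding only at the very end keeps every library call away from raw utf-8 bytes, which is where the ascii assumption was biting.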
03-17-2013, 07:34 PM | #524 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
|
03-24-2013, 09:24 AM | #525 | |
Grand Sorcerer
Posts: 27,953
Karma: 198500000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
Would it make sense to do something similar (full unicode) in those additional three locations? |
|
|