12-01-2011, 03:42 PM | #1 |
Member
Posts: 14
Karma: 10
Join Date: May 2011
Location: Campo Grande, MS, Brazil
Device: Kindle 3
|
Correct Mobi location formula (maybe)
Hi, everyone.
For a while I've been trying to understand Kindle locations, and the consensus seems to be that for Mobi, 1 location = 128 bytes (it's even on https://wiki.mobileread.com/wiki/Page_numbers). But today I was taking a look at the Amazon Cloud Reader (reader.amazon.com) source code and found the following:
Code:
locationFromPosition: function (a) { return Math.floor(a * 2 / 300 + 1) }
The surrounding code is a bit larger, but in the end that seems to be the formula used to calculate the location for the Mobi format (there is another one for Topaz, which is Math.floor((a * d + 100) / 100)).
I've looked at the parameter "a", and for the text parts it seems to be a count of characters, but I don't know whether that means bytes in every case. I've tested the formula a little and it seems to be correct, but not enough for me to be sure. I also don't know whether the relation holds for images and other things apart from characters.
I couldn't find the original source for the 128 bytes figure, so I'm guessing it's an approximation. But as I said, in my initial tests the above formula seems to work.
What do you guys think? Does it make sense? Is the 128 bytes figure an approximation, or is it in the format specs?
[]'s
Rodrigo
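To make the difference between the two rules concrete, here is a minimal sketch that evaluates both at a few offsets. `locationFromPosition` follows the formula quoted from the Cloud Reader source; `locationFrom128ByteRule` is a hypothetical helper for the older rule of thumb, and giving it the same "+ 1" so that both start at location 1 is an assumption, not something taken from any spec.

```javascript
// Formula as quoted in the post; `position` is the offset the post calls "a".
function locationFromPosition(position) {
  return Math.floor(position * 2 / 300 + 1);
}

// Hypothetical helper for the "1 location = 128 bytes" rule of thumb.
// The "+ 1" (so numbering starts at 1) is an assumption made for comparison.
function locationFrom128ByteRule(position) {
  return Math.floor(position / 128 + 1);
}

// Print both results side by side for a few sample offsets.
for (const position of [0, 149, 150, 15000, 1500000]) {
  console.log(
    position,
    locationFromPosition(position),
    locationFrom128ByteRule(position)
  );
}
```

Since a * 2 / 300 is just a / 150, the two rules drift apart by roughly 17% as the offset grows, which is large enough to notice on any full-length book.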
12-01-2011, 05:03 PM | #2 |
Grand Sorcerer
Posts: 28,038
Karma: 199464182
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I've always heard 128 bytes of source html, but since Amazon doesn't release detailed specs, who knows?
|
12-01-2011, 05:25 PM | #3 | |
The Grand Mouse 高貴的老鼠
Posts: 72,470
Karma: 309060442
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
Quote:
My LotR is encoded with Windows Latin-1. Let's see what happens with a UTF-8 encoded ebook. I suspect that we need to pass bytes, not characters.
My copy of Unfinished Tales is a (kindlegen compiled) conversion from an ePub. It's UTF-8 encoded. It has 1590431 characters, but 1613151 bytes of mobi-html. In Kindle for Mac it has 10755 locations.
1590431*2/300+1 = 10603 (.87333...)
1613151*2/300+1 = 10755 (.34)
Well. That seems definite. The number of bytes (not characters) through the unpacked mobi-html of the book, when divided by 150, adding 1 and truncating, is the Kindle location in the book.
Good find!
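The byte-versus-character check above can be reproduced with a short sketch. It assumes the book's markup is UTF-8 (for a Windows Latin-1 book, bytes and characters coincide, so the test would not distinguish them). `utf8ByteLength` and the sample word are illustrative only; the two large numbers are the character and byte counts given in the post.

```javascript
// Formula from the first post; `position` is treated as a byte offset here.
function locationFromPosition(position) {
  return Math.floor(position * 2 / 300 + 1);
}

// Hypothetical helper: UTF-8 byte length of a string.
// TextEncoder always encodes to UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Illustrative word with three non-ASCII letters, written with escapes
// so the source does not depend on the file's own encoding.
const sample = "N\u00famen\u00f3r\u00eb";
console.log(sample.length, utf8ByteLength(sample)); // characters vs bytes

// Counts quoted in the post for Unfinished Tales.
console.log(locationFromPosition(1590431)); // from the character count
console.log(locationFromPosition(1613151)); // from the byte count
```

Only the byte count reproduces the 10755 locations reported by Kindle for Mac, which is the basis for the conclusion in the post.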
|
12-01-2011, 05:30 PM | #4 |
The Grand Mouse 高貴的老鼠
Posts: 72,470
Karma: 309060442
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
I've updated that page in the wiki to reflect this new finding.
|
08-07-2017, 06:25 PM | #5 |
Connoisseur
Posts: 53
Karma: 118948
Join Date: Jul 2014
Device: Kindle PaperWhite 3
|
Sorry for the necropost if that's not allowed, but I've been investigating this and just now found this thread. Given the last post here, I searched the MobileRead wiki. Is this the page: https://wiki.mobileread.com/wiki/Page_numbers ?
If so, my question is about the formula's input: is it a byte offset from the "start" of the book, where the book is the actual .mobi file? In other words, if I have a location (taken from the My Clippings.txt file), can I convert it into a byte offset into the mobi file with something like (location - 1) * 150? I guess the tricky part would then be mapping that byte offset into the unpacked contents of the mobi.
Basically, I'm trying to see whether it's possible to take highlight locations from the My Clippings.txt file on the Kindle and map them to the actual source file in an unpacked book. It would probably be easier to just brute-force search for the highlighted text than to attempt this location mapping, especially since I've only seen this mentioned for mobi, so I'm not sure whether it works the same way with azw3.
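As a rough sketch of the inverse mapping asked about here: if a location really is Math.floor(offset / 150 + 1), as worked out earlier in the thread, then a location corresponds to a 150-byte range rather than to a single offset, and (location - 1) * 150 is the start of that range. `byteRangeForLocation` is a hypothetical helper, and whether the locations in My Clippings.txt follow exactly this numbering is an assumption.

```javascript
// Forward formula from the first post, for the round-trip check below.
function locationFromPosition(position) {
  return Math.floor(position * 2 / 300 + 1);
}

// Hypothetical helper: the inclusive range of byte offsets that map to
// `location` under the formula above.
function byteRangeForLocation(location) {
  if (!Number.isInteger(location) || location < 1) {
    throw new RangeError("location must be a positive integer");
  }
  const start = (location - 1) * 150; // first byte of this location
  const end = start + 149;            // last byte of this location
  return { start, end };
}

// Round trip: both ends of the range map back to the same location.
const range = byteRangeForLocation(10755);
console.log(range, locationFromPosition(range.start), locationFromPosition(range.end));
```

The offsets are relative to the unpacked markup discussed earlier in the thread, not to the start of the .mobi container file.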
08-07-2017, 07:40 PM | #6 | |||
Grand Sorcerer
Posts: 6,736
Karma: 86234863
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
|
Quote:
Quote:
Quote:
There is a recent thread in the Amazon Kindle forum, Extract notes from "My Clippings.txt", started by someone who wants to do something similar. See the post from me in that thread for information on how locations map to KF8/AZW3.
KFX format, which is used for most Amazon-purchased books on newer Kindles, would take a lot more work to deal with.
Good luck.
Last edited by jhowell; 08-07-2017 at 08:07 PM.
|||
08-07-2017, 08:46 PM | #7 | |
Connoisseur
Posts: 53
Karma: 118948
Join Date: Jul 2014
Device: Kindle PaperWhite 3
|
Thank you jhowell for responding, I really appreciate it.
Quote:
Put another way, if the offset is 700 but the "first file" (again, I'd need to know what the first file even is) only goes up to 500, then I'd need to know which second file to offset 200 into, correct? What defines that order?
Also, just to be sure: you're saying that a Kindle "Position" is just that "raw byte offset", right? So using the formula mentioned in this thread I could go from that raw byte offset to a Kindle "Location" and vice versa. I suppose that if the location I arrive at is not exact, I can at least use it to determine which HTML file to search for the highlighted text in, which would drastically reduce the search space.
Thanks for the link to that thread! It definitely seems useful.
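The idea of using an inexact location only to narrow the search could look something like the sketch below. It assumes the raw markup is available as UTF-8 bytes, that 150 bytes make one location as discussed earlier in the thread, and that the highlighted text appears verbatim in the markup (a highlight that spans tags or entities would not match). `indexOfBytes`, `findHighlight` and the `slackLocations` parameter are all hypothetical names.

```javascript
// Hypothetical helper: find `needle` in `haystack` (both Uint8Array),
// considering only start positions in [from, to). Returns a byte offset or -1.
function indexOfBytes(haystack, needle, from, to) {
  const last = Math.min(to, haystack.length) - needle.length;
  for (let i = Math.max(0, from); i <= last; i++) {
    let j = 0;
    while (j < needle.length && haystack[i + j] === needle[j]) j++;
    if (j === needle.length) return i;
  }
  return -1;
}

// Hypothetical helper: search for the highlighted text only within a few
// locations on either side of the reported one (assumes 150 bytes each).
function findHighlight(markupBytes, location, highlightText, slackLocations = 2) {
  const needle = new TextEncoder().encode(highlightText);
  const from = (location - 1 - slackLocations) * 150;
  const to = (location + slackLocations) * 150 + needle.length;
  return indexOfBytes(markupBytes, needle, from, to);
}

// Usage with made-up markup: 400 bytes of filler, then a paragraph.
const markup = new TextEncoder().encode("x".repeat(400) + "<p>hello world</p>");
console.log(findHighlight(markup, 3, "hello world"));
```

Searching bytes rather than decoded text keeps the result directly comparable with the byte offsets the location formula works in.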
|
08-07-2017, 09:40 PM | #8 |
Grand Sorcerer
Posts: 6,736
Karma: 86234863
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
|
Once you unpack it, MOBI contains the equivalent of only a single HTML file. It is much more primitive than EPUB. The content isn't pure HTML. It has special markup for image references, for example.
See the wiki for more details.
Last edited by jhowell; 08-07-2017 at 09:55 PM. |
08-07-2017, 10:12 PM | #9 |
Connoisseur
Posts: 53
Karma: 118948
Join Date: Jul 2014
Device: Kindle PaperWhite 3
|
Ah that makes sense then, thanks again!
|
08-09-2017, 08:49 PM | #10 |
Grand Sorcerer
Posts: 28,038
Karma: 199464182
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
KindleUnpack has an option to output raw mobi markup. My guess is that this is the uncompressed markup the locations formula would be based on. Even the monolithic html file that KindleUnpack produces for MOBI books will have been tweaked a bit.
|
Tags |
kindle, location, mobi |