Old 12-01-2011, 03:42 PM   #1
rodrigoccurvo
Member
Posts: 14
Karma: 10
Join Date: May 2011
Location: Campo Grande, MS, Brazil
Device: Kindle 3
Correct Mobi location formula (maybe)

Hi, everyone.

For a while I've been trying to understand Kindle locations, and the consensus seems to be that for Mobi, 1 location = 128 bytes (it's even on https://wiki.mobileread.com/wiki/Page_numbers).

But today I was taking a look at the Amazon Cloud Reader (reader.amazon.com) source code and I found the following:

Code:
locationFromPosition: function (a) {
        return Math.floor(a * 2 / 300 + 1)
}
(The file is KindleReaderApp-min.js, but I won't post the whole link since it's weird and I don't know if it contains session information. I guess you can find it on your own.)

The surrounding code is a bit larger, but in the end that seems to be the formula used for calculating the location for the Mobi format (there is another for Topaz, which is Math.floor((a * d + 100) / 100)).

I've tried looking at the parameter "a" and for the text parts it seems to be characters, but I don't know if that means bytes for every case. I've tested it a little bit just to know if it's correct and it seems to be, but not enough for me to be sure. Also, I don't know if the relation holds for images and other things apart from characters.

I don't know and couldn't find the original source for the 128 bytes information, so I'm guessing it's an approximation. But as I said, in my initial tests the above formula seems to work.
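As a minimal sketch of the formula quoted above (a standalone copy for experimenting, not Amazon's actual code path; the sample positions are illustrative assumptions, not values from a real book), the mapping can be run outside the reader like this. It shows that positions 0 through 149 fall in location 1 and that position 150 starts location 2.

```javascript
// Standalone copy of the formula quoted from KindleReaderApp-min.js.
// Assumption: `position` is a zero-based offset into the book text.
function locationFromPosition(position) {
  return Math.floor(position * 2 / 300 + 1);
}

// Illustrative sample positions (not taken from a real book).
const samples = [0, 149, 150, 299, 300];
const locations = samples.map(locationFromPosition);

// Logs the locations 1, 1, 2, 2, 3.
console.log(locations);
```

Since `a * 2 / 300` is the same as `a / 150`, the formula amounts to one location per 150 units of position.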

What do you guys think? Does it make sense? Is the 128 bytes info an approximation, or is it in the format specs?

[]'s

Rodrigo
Old 12-01-2011, 05:03 PM   #2
DiapDealer
Grand Sorcerer
Posts: 28,038
Karma: 199464182
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I've always heard 128 bytes of source html, but since Amazon doesn't release detailed specs, who knows?
Old 12-01-2011, 05:25 PM   #3
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 72,470
Karma: 309060442
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Now, that's an interesting finding. And it can be checked fairly easily. My copy of The Lord of the Rings in Kindle for Mac has 24992 locations. The raw mobi-html is 3748684 bytes long. Plugging that into the formula you've found, you get 24992 (.22666...). Taking the 128-byte estimate, you get 29286 (.59375).

My LotR is encoded with Windows Latin-1. Let's see what happens with a UTF-8 encoded ebook. I suspect that we need to pass bytes, not characters.
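To make the bytes-versus-characters distinction concrete, here is a small sketch. The sample string is a hypothetical stand-in for book text, and `TextEncoder` (which always produces UTF-8) is used only as a convenient way to count bytes. In a single-byte encoding such as Windows Latin-1 the two counts are the same, but in UTF-8 any non-ASCII character takes more than one byte.

```javascript
// Compare the character count with the UTF-8 byte count of the same text.
// The sample is "naive cafe" with a diaeresis and an acute accent, written
// with escapes so the exact code points are unambiguous.
const sample = "na\u00efve caf\u00e9";

const charCount = [...sample].length;                       // Unicode code points
const byteCount = new TextEncoder().encode(sample).length;  // UTF-8 bytes

console.log(charCount, byteCount); // 10 12
```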

My copy of Unfinished Tales is a (kindlegen compiled) conversion from an ePub. It's UTF-8 encoded. It has 1590431 characters, but 1613151 bytes of mobi-html. In Kindle for Mac it has 10755 locations.

1590431 * 2 / 300 + 1 = 10603 (.87333...)
1613151 * 2 / 300 + 1 = 10755 (.34)

Well. That seems definite. Take the number of bytes (not characters) through the unpacked mobi-html of the book, divide by 150, add 1 and truncate, and you have the Kindle location in the book.
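The arithmetic above can be re-run as a short sketch. The byte counts and reported location totals are the figures given in this post; the helper name `locationAt` and the "+ 1, then truncate" treatment of the 128-byte variant are assumptions made for the comparison.

```javascript
// Figures reported in this thread (location totals from Kindle for Mac).
const books = [
  { title: "The Lord of the Rings", bytes: 3748684, reported: 24992 },
  { title: "Unfinished Tales", bytes: 1613151, reported: 10755 },
];

// Location of a byte offset, assuming a fixed number of bytes per location.
const locationAt = (bytes, bytesPerLocation) =>
  Math.floor(bytes / bytesPerLocation + 1);

const results = books.map((book) => ({
  title: book.title,
  reported: book.reported,
  with150: locationAt(book.bytes, 150), // formula from the Cloud Reader source
  with128: locationAt(book.bytes, 128), // older 128-byte estimate
}));

for (const r of results) {
  const verdict = r.with150 === r.reported ? "matches" : "differs";
  console.log(`${r.title}: 150 bytes/location ${verdict} (${r.with150} vs ${r.reported}); 128 gives ${r.with128}`);
}
```

For both books the 150-byte figure equals the reported total, while the 128-byte figure overshoots by several thousand locations.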

Good find!
Old 12-01-2011, 05:30 PM   #4
pdurrant
I've updated that page in the wiki to reflect this new finding.
Old 08-07-2017, 06:25 PM   #5
blaenk
Connoisseur
Posts: 53
Karma: 118948
Join Date: Jul 2014
Device: Kindle PaperWhite 3
Sorry for the necropost if that's not allowed, but I've been investigating this and just now found this thread. Given the last post here, I decided to search around the MobileRead wiki. Is this the page: https://wiki.mobileread.com/wiki/Page_numbers ?

If so, my question is whether the formula takes as input a byte offset from the "start" of the book, where the book is the actual .mobi file. So if I have a location (taken from the My Clippings.txt file) I can convert that into a byte-offset position into the mobi file by doing something like (location - 1) * 150? I guess the tricky part would be then mapping that byte-offset into the unpacked contents of the mobi.

Basically I'm trying to see if it would be possible to take highlight locations from the 'My Clippings.txt' file on the Kindle device and map them to the actual source file in an unpacked book, but it would probably be easier to just brute-force search for the actual highlighted contents than to try to do this location mapping, especially since I've only seen mention of mobi so I'm not sure if this works the same with azw3.
Old 08-07-2017, 07:40 PM   #6
jhowell
Grand Sorcerer
Posts: 6,736
Karma: 86234863
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
Quote:
Originally Posted by blaenk View Post
I can convert that into a byte-offset position into the mobi file by doing something like (location - 1) * 150?
Yes. But it is an offset into the unpacked raw HTML content of the MOBI.
Quote:
Originally Posted by blaenk View Post
I guess the tricky part would be then mapping that byte-offset into the unpacked contents of the mobi.
You can obtain the raw HTML contents of a MOBI file (what location numbers index into) using kindleunpack. You will also have to deal with DRM for many books.
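A minimal sketch of that inverse mapping, assuming the relation location = floor(offset / 150 + 1) discussed earlier in this thread. The function name and the returned object shape are hypothetical. Each location then covers a 150-byte range of the unpacked raw markup rather than a single offset.

```javascript
// Byte range of the unpacked raw MOBI markup covered by one location,
// assuming location = floor(offset / 150 + 1).
const BYTES_PER_LOCATION = 150;

function byteRangeForLocation(location) {
  if (!Number.isInteger(location) || location < 1) {
    throw new RangeError("location must be a positive integer");
  }
  const start = (location - 1) * BYTES_PER_LOCATION; // inclusive
  const end = location * BYTES_PER_LOCATION;         // exclusive
  return { start, end };
}

console.log(byteRangeForLocation(1));     // { start: 0, end: 150 }
console.log(byteRangeForLocation(24992)); // { start: 3748650, end: 3748800 }
```

The last location of a book will usually be shorter than 150 bytes, so in practice `end` should be clamped to the length of the unpacked markup.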

Quote:
Originally Posted by blaenk View Post
Basically I'm trying to see if it would be possible to take highlight locations from the 'My Clippings.txt' file on the Kindle device and map them to the actual source file in an unpacked book, but it would probably be easier to just brute-force search for the actual highlighted contents than to try to do this location mapping, especially since I've only seen mention of mobi so I'm not sure if this works the same with azw3.
Kindle locations are approximate indexes into the book, good enough to get you to the right screen of content. It is hard to tell whether or not that will be good enough for your purpose.

There is a recent thread in the Amazon Kindle forum, Extract notes from "My Clippings.txt", started by someone who wants to do something similar. See the post from me in that thread for information on how locations map to KF8/AZW3.

KFX format, which is used for most Amazon-purchased books on newer Kindles, would take a lot more work to deal with.

Good luck.

Last edited by jhowell; 08-07-2017 at 08:07 PM.
Old 08-07-2017, 08:46 PM   #7
blaenk
Thank you jhowell for responding, I really appreciate it.

Quote:
Originally Posted by jhowell View Post
Yes. But it is an offset into the unpacked raw HTML content of the MOBI.
What I'm not sure about is this. I don't have much experience with MOBI, but when I've unpacked EPUBs I've noticed that they sometimes (often? always?) contain multiple HTML files. If that can be true for MOBI as well, and this is an offset into the unpacked raw HTML, then there must be some defined order, so that it is well-defined which file an offset lands in once it goes past the "first" file (and also what the first file is). Does that make sense? If so, what determines this order? Would it be some metadata file contained within the MOBI that defines the book order of the pages, which is also the order of the files that the offset indexes into?

Put another way, if the offset is 700 but the "first file" (again, I'd need to know what the first file even is) only goes up to 500, then I'd need to know what second file to offset 200 into, correct? What defines that order?

Also, just to be sure, you're saying that a Kindle "Position" is just that "raw byte offset" right? So using the formula mentioned in this thread I could go from that raw byte offset to a Kindle "Location" and vice versa.

I suppose that if the location I arrive at is not exact, I can at least use it to determine what html file to search for the highlighted text within/around so as to drastically reduce the search space.
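That narrowing idea could be sketched roughly as follows. This assumes Node.js (for `Buffer`), that the unpacked markup is available as one UTF-8 byte buffer, and that the highlighted text appears in it without intervening tags or entities. The function name, the `slack` parameter and the synthetic test data are all hypothetical.

```javascript
// Search for a highlight only near the byte offset implied by its location,
// instead of scanning the whole book. Returns the byte offset of the match
// in the full markup, or -1 if it is not found inside the window.
function findHighlightNear(rawMarkup, location, highlight, slack = 2, bytesPerLocation = 150) {
  const needle = Buffer.from(highlight, "utf8");
  // Widen the window by `slack` locations on each side, since locations
  // are only approximate indexes.
  const start = Math.max(0, (location - 1 - slack) * bytesPerLocation);
  const end = Math.min(
    rawMarkup.length,
    (location + slack) * bytesPerLocation + needle.length
  );
  const hit = rawMarkup.subarray(start, end).indexOf(needle);
  return hit === -1 ? -1 : start + hit;
}

// Synthetic stand-in for unpacked markup: the phrase starts at byte 400,
// which the formula places in location 3.
const markup = Buffer.from("x".repeat(400) + "hello world" + "y".repeat(400), "utf8");
console.log(findHighlightNear(markup, 3, "hello world")); // 400
```

If the windowed search returns -1, falling back to a search over the whole buffer would cover cases where the location is further off than the chosen slack.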

Thanks for the link to that thread! It definitely seems useful.
Old 08-07-2017, 09:40 PM   #8
jhowell
Once you unpack it, MOBI contains the equivalent of only a single HTML file. It is much more primitive than EPUB. The content isn't pure HTML. It has special markup for image references for example.

See the wiki for more details.

Last edited by jhowell; 08-07-2017 at 09:55 PM.
Old 08-07-2017, 10:12 PM   #9
blaenk
Ah that makes sense then, thanks again!
Old 08-09-2017, 08:49 PM   #10
DiapDealer
KindleUnpack has an option to output raw mobi markup. My guess is that this is the uncompressed markup the location formula would be based on. Even the monolithic HTML file that KindleUnpack produces for MOBI books will have been tweaked a bit.

Tags
kindle, location, mobi

