Old 09-01-2010, 02:01 PM   #1
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 72,538
Karma: 309500000
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Kindlestrip Python script and AppleScript wrapper

KindleGen, Kindle Comic Creator and Kindle Previewer embed the source files used to compile a Kindle ebook as one of the (invisible) records in that ebook.

So I wrote a Python script that strips the sources record out of Kindle format ebooks. For those on Macs, I also wrote an AppleScript wrapper and put the Python script inside the AppleScript bundle to make things easy.

Kevin Hendricks has since updated the code to handle files from KindleGen 2.x, and I've also tweaked a bit more to handle KindleGen 2.7.

If you're going to upload to the Amazon store, this script is usually unnecessary, as Amazon will strip the sources before delivery anyway.

Do not use this script to make files to be uploaded to KDP, unless you have to because of size constraints on uploaded files.

Kindlegen now includes the option to not add the source files to the end of the generated book. So if you're using Kindlegen and want a file without the sources added, don't use KindleStrip, but specify this option in Kindlegen to get guaranteed correctly formatted books.

If you're on a Mac you only need the AppleScript, as it includes the Python script. The AppleScript is a simple drag-and-drop operation: drag your KindleGen-generated file onto it, and it creates one named [oldname]_stripped.mobi.

As always, please comment with any bug reports or problems.
Attached Files
File Type: zip KindleStrip 1.36.app.zip (35.1 KB, 7257 views)
File Type: zip kindlestrip_v136.py.zip (4.8 KB, 13407 views)

Last edited by pdurrant; 06-12-2014 at 06:07 PM.
Old 09-03-2010, 06:23 PM   #2
pdurrant
Now at version 1.1. Writes out the stripped data as a zip file. The data in the Mobipocket file seems to have a 16 byte header that's written out as hexadecimal to the standard output. Those using the AppleScript won't see this at all. I have no idea what the 16 bytes mean, so this probably isn't a loss.
Old 09-03-2010, 06:40 PM   #3
daffy4u
I'm Super Kindle-icious
Posts: 6,734
Karma: 2434103
Join Date: Apr 2008
Location: Long Drive, Calinadia Candafornia
Device: KDXG, KT, Oasis
Thanks pdurrant! I don't have any books to upload to Amazon but I always appreciate the efforts of those who push the Kindle limits to make it even more useful.
Old 09-24-2010, 01:23 AM   #4
ATDrake
Wizzard
Posts: 11,517
Karma: 33048258
Join Date: Mar 2010
Location: Roundworld
Device: Kindle 2 International, Sony PRS-T1, BlackBerry PlayBook, Acer Iconia
Just tried this on a couple of auto-generated mobis made via the new version of Kindle Previewer (1.5).

It now has "ePub support", by which it means that it automatically converts any ePubs dragged upon it to mobi and drops the file in the same folder, apparently on the lower -c1 compression setting. Also a new simulation option for iPad, but no K3 mode yet. But the people trying to figure out Kindle Audio/Video now have a new testing tool for their efforts.

Anyway, the stripping works a treat and the extraction gives back almost exactly what went in, as far as I can tell. Did a few more tests with my lazily assembled Fictionwise cleanup conversions and html comes back as zipped html, and a zipped up ePub in yields the exact same zipped-up ePub out.

Interestingly enough, if you originally pointed KindleGen at an opf (either custom or via unpacked epub), then no matter what the source structure, the unzipped-from-stripped version yields up the css, html, image, and misc (ncx, etc.) files rearranged into separate subdirectories with exactly those names.

Stripped file has immense space savings, often near-halving; sometimes more if there are a fair number of graphics involved in the source. Even pure text with no pictures is over a third smaller.

I have absolutely no idea why Amazon would remove the entirely logical -donotaddsource option unless they actually want to serve up plenty of bloated files via 3G and cut down on the marketable "Kindle can hold #### books!" space (and deduct extra from royalties paid out, of course), which seems rather counter-productive to me.

While we're on the subject of inexplicable KindleGen design decisions, might as well mention some more things I found out while using it:
  1. Plain old descendant selectors, a staple since CSS1, seem to be completely ignored. That is another black mark for KindleGen's (lack of) CSS support, and it means that one will likely have to class every item one wants to target with a particular style not shared with its siblings, rather than classing a container parent element for the lot and letting specific descent rather than generic inheritance take place.
  2. If you forget to close a <div> with styling applied, all subsequent text seems to be rendered with the same styling, even if it occurs in separate files in the source, at least until it hits the next tag with a different style.
  3. If you have any superfluous tags in your NCX, even a mistakenly applied empty closing tag like say, </head>, then KindleGen will merrily ignore your painstakingly constructed <navMap> and happily build with nary a warning until you find out that your mobi has no chapter marks and spend far too long trying to figure out why.
Thanks again for writing this script! I'm sure people will be finding it very useful if Amazon's going to insist on always including the source files.
Old 09-24-2010, 02:17 PM   #5
ATDrake
Also, I think I've figured out what the mysterious header bytes mean.

If your source was converted straight from a properly zipped ePub, then you get 53524353000000100000003000000001. If it came from any combination of un-prepackaged html/opf, it'll be 53524353000000100000002f00000001. If it's a no-source-files-added mobi to begin with, then the header bytes are 46434953000000140000001000000002.
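For anyone who wants to look at those three values more closely, here is a minimal Python 3 sketch that splits each 16-byte header into a 4-character tag followed by three big-endian 32-bit integers. That split is an assumption based only on how the hex digits group; the thread does not establish what the integer fields mean, and the function name `decode_header` is made up for illustration.

```python
import struct

# The three header values reported in the post above.
HEADERS = {
    "zipped ePub source": "53524353000000100000003000000001",
    "loose html/opf source": "53524353000000100000002f00000001",
    "no source record": "46434953000000140000001000000002",
}

def decode_header(hex_text):
    """Split a 16-byte header into an ASCII tag and three big-endian integers.

    The 4 + 4 + 4 + 4 byte layout is an assumption, not a documented format.
    """
    raw = bytes.fromhex(hex_text)
    tag, a, b, c = struct.unpack(">4sIII", raw)
    return tag.decode("ascii"), a, b, c

for label, hex_text in HEADERS.items():
    print(label, decode_header(hex_text))
```

Read this way, the first four bytes of the first two values spell SRCS and those of the third spell FCIS, which fits the idea that the last one is not a sources record at all.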

And it seems that even the samples offered for the newer books at Amazon nowadays include the bloat (but only from the mobi conversion and cut off appropriately at the sample length), which looks like it's a useless expenditure to me.

Ah well, if they want to waste their server bandwidth for no good reason, that's entirely up to them. As long as they don't go back to charging that extra $2 Whispernet surcharge that they finally got rid of for Canadians.
Old 02-21-2011, 01:55 PM   #6
twedigteam
Enthusiast
Posts: 38
Karma: 10
Join Date: Nov 2010
Device: Sony eReader
If anyone still takes a gander at this thread, having some issues running the Kindlestrip tool on OSX10.6.6; a simple drag & drop of a .mobi file onto the AppleScript file doesn't actually cause anything to occur...taking a closer look, I'm wondering if the inherent Python files on my Mac are too outdated to run kindlestrip properly (I had no issue whatsoever using your ePub zip/unzip scripts, but I could be misled in that they don't use the Python language?). My version:

Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)

Noticed that there is a more current build of 3.2, wondering if maybe this could be the issue? I'm sure also there is a way to run from Terminal, but I am certainly not at that level of familiarity with Python to do so....thanks in advance if anyone spots this...
Old 02-21-2011, 02:08 PM   #7
ATDrake
I'm also on 10.6.6 and the AppleScript has been working for me for the past couple of months and again when I used it yesterday.

I used to have the standard Python 2.6-ish install, but then I went and got the 2.7.1 installer from Python.org (after the source failed to compile, grr).

Maybe your unzip utility sets the permissions wrongly?

In any case, to use it on the command-line, just do python PATH/TO/kindlestrip.py OriginalFile.mobi OutputFile.mobi OptionalStrippedData.zip

You can drag and drop the kindlestrip.py file onto the Terminal window and it will autofill its path, and the 3rd filename is optional if you don't care about looking at the stripped data.

You can also alias it in your .profile for convenience, aka:

alias kstrip="python PATH/TO/kindlestrip.py"

and then string together a series of commands to batch process a folder:

alias kstripbatch='for m in *.mobi; do kstrip "$m" "${m/.mobi/-stripped.mobi}"; done'
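The same batch run can be sketched in Python 3 for anyone who would rather not rely on shell aliases. This is only an illustration: the helper names `build_commands` and `strip_folder` are hypothetical, `PATH/TO/kindlestrip.py` is the same placeholder as above, and it assumes that `python` on your PATH is an interpreter kindlestrip.py can run under. Unlike the alias, it skips files whose names already end in -stripped, so a second run does not process its own output.

```python
import subprocess
from pathlib import Path

PYTHON = "python"                       # assumed interpreter name
KINDLESTRIP = "PATH/TO/kindlestrip.py"  # placeholder path, as in the post above

def build_commands(folder, script=KINDLESTRIP):
    """Return one kindlestrip command (as an argument list) per .mobi file."""
    commands = []
    for mobi in sorted(Path(folder).glob("*.mobi")):
        if mobi.stem.endswith("-stripped"):
            continue  # output of an earlier run, leave it alone
        out = mobi.with_name(mobi.stem + "-stripped.mobi")
        commands.append([PYTHON, script, str(mobi), str(out)])
    return commands

def strip_folder(folder):
    """Run kindlestrip on every unstripped .mobi file in folder."""
    for cmd in build_commands(folder):
        # check=False: carry on with the next file if one fails
        subprocess.run(cmd, check=False)
```

Calling `strip_folder(".")` processes the current folder; printing the result of `build_commands(".")` first shows what would be run without touching any files.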
Old 02-21-2011, 04:03 PM   #8
pdurrant
Quote:
Originally Posted by twedigteam View Post
If anyone still takes a gander at this thread, having some issues running the Kindlestrip tool on OSX10.6.6; a simple drag & drop of a .mobi file onto the AppleScript file doesn't actually cause anything to occur...taking a closer look, I'm wondering if the inherent Python files on my Mac are too outdated to run kindlestrip properly (I had no issue whatsoever using your ePub zip/unzip scripts, but I could be misled in that they don't use the Python language?). My version:

Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)

Noticed that there is a more current build of 3.2, wondering if maybe this could be the issue? I'm sure also there is a way to run from Terminal, but I am certainly not at that level of familiarity with Python to do so....thanks in advance if anyone spots this...
I can't think why it wouldn't work for you. It works here. You don't need Python 3.x. Most of the Python scripts around are written for Python 2.x where x>=5, including this one. It may well not work with 3.x at all.

What happens if you just double-click the AppleScript? (It should ask you to locate kindlestrip.py - just click cancel if it does.)
Old 02-21-2011, 08:25 PM   #9
twedigteam
Quote:
Originally Posted by ATDrake View Post

In any case, to use it on the command-line, just do python PATH/TO/kindlestrip.py OriginalFile.mobi OutputFile.mobi OptionalStrippedData.zip
Worked like a charm. Clearly no issue with the code if this goes through. I'll retry the script on a coworker's system later in the week.

Once again, a tip of the hat...the help here is impressively reliable, and kudos on the tools....
Old 02-21-2011, 08:27 PM   #10
twedigteam
Quote:
Originally Posted by pdurrant View Post
I can't think why it wouldn't work for you. It works here. You don't need Python 3.x. Most of the python scripts around are written for Python 2.x where x>=5, including this one. It may well not work with 3.x at all.

What happens if you just double-click the applescript? (It should ask you to locate kindlestrip.py - just click cancel if it does.)
Double-clicking does ask to locate the .py file, and I've tried every possible combination, including removing the scripts and re-downloading them. As I mentioned above, it works fine on the command line, so the AppleScript issue is just a local one.

....thanks again!
Old 02-22-2011, 04:29 AM   #11
pdurrant
Quote:
Originally Posted by twedigteam View Post
it works fine in command line so the AppleScript issue is just a local one
How odd. You could try opening it with Script Editor and re-saving.
Old 02-22-2011, 06:54 AM   #12
Piquan
Junior Member
Posts: 1
Karma: 10
Join Date: Feb 2011
Device: Kindle 3
Thanks for your investigation, and the tool!

After I got v1.1, I added, at line 78 (just after calculating penoffset and lastoffset), the following:
if datain[self.penoffset:self.penoffset+4] != 'SRCS':
    raise StripException("already stripped")

The intention here is to not delete the FCIS segment from an already-stripped file (or one that was generated with -donotaddsource). I'm enough of a doofus that I'm sure to mess up something by stripping it twice! (On the other hand, I've found one source that says the FLIS and FCIS segments aren't necessary for the Kindle, so at least I'd get a few second chances.)
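For readers without the v1.1 source to hand, the check can be sketched as a standalone Python 3 function. This is not the kindlestrip code itself. It assumes the standard Palm database layout (a 2-byte big-endian record count at byte 76, followed by one 8-byte entry per record starting at byte 78, each beginning with a 4-byte record offset) and, as in v1.1, that the sources record is the second-to-last record. The function names are made up for illustration.

```python
import struct

def record_offsets(data):
    """Return the start offset of every record in a Palm database file."""
    # Record count lives at byte 76; the record info list starts at byte 78.
    (count,) = struct.unpack_from(">H", data, 76)
    # Each 8-byte entry starts with the record's 4-byte big-endian offset.
    return [struct.unpack_from(">L", data, 78 + 8 * i)[0] for i in range(count)]

def has_source_record(data):
    """True if the second-to-last record begins with the SRCS tag."""
    offsets = record_offsets(data)
    if len(offsets) < 2:
        return False
    penoffset = offsets[-2]
    # Python 3 file data is bytes, so compare against b"SRCS", not 'SRCS'.
    return data[penoffset:penoffset + 4] == b"SRCS"
```

To try it on a real book, read the file in binary mode, for example `has_source_record(open("OriginalFile.mobi", "rb").read())`. A file shorter than the Palm header will raise `struct.error` rather than return False.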

Thanks again!
Old 02-22-2011, 07:19 AM   #13
pdurrant
Quote:
Originally Posted by Piquan View Post
Thanks for your investigation, and the tool!

After I got v1.1, I added, at line 78 (just after calculating penoffset and lastoffset), the following:
if datain[self.penoffset:self.penoffset+4] != 'SRCS':
    raise StripException("already stripped")

The intention here is to not delete the FCIS segment from an already-stripped file (or one that was generated with -donotaddsource). I'm enough of a doofus that I'm sure to mess up something by stripping it twice! (On the other hand, I've found one source that says the FLIS and FCIS segments aren't necessary for the Kindle, so at least I'd get a few second chances.)

Thanks again!
What a good find. I hadn't realised that there were some constant bytes in that bit. I'll see if I can get an updated version done.
Old 03-03-2011, 08:14 AM   #14
pdurrant
Quote:
Originally Posted by pdurrant View Post
What a good find. I hadn't realised that there were some constant bytes in that bit. I'll see if I can get an updated version done.
Now updated to version 1.2, adding the sanity checking suggested by Piquan.
Old 09-19-2011, 03:41 AM   #15
Xabache
Carbon Reserve
Posts: 44
Karma: 10
Join Date: Jun 2010
Device: PC
Could someone write a step-by-step tutorial for using kindlestrip? I have tried to follow along but fell flat on my face despite being generally knowledgeable about computers. Step by step please, download this, drag that... Thanks.

Last edited by Xabache; 09-19-2011 at 03:45 AM.