02-03-2008, 08:29 PM | #1 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Using perl scripts to produce .IMP ebooks and more...
I wanted to use command-line tools to convert .html directly into .IMP format, bypassing the need for the eBook Publisher GUI. Don't get me wrong, I think the eBook Publisher is a very powerful tool. It is the most effective way to deal with multiple input files, especially if they do not 'lend' themselves to being used in an ebook.
To achieve the best results, the .html should first be cleaned up by 'Tidy'. This will remove those annoying '?' characters, correct an ill-formed TOC and clean up the .html.
I needed this primarily because I was converting single .html files from various sources (expanded .PRC/.PDB, exploded .LIT, Blackmask/Project Gutenberg .html, etc.). I found it cumbersome to use the eBook Publisher for just one file, especially if the .html filename was in the format 'authorname - title'.html. For these .html files, the filename already held all the info I needed to properly create a .IMP ebook; all I had to do was choose the category and I would be finished! Enter the perl scripts...
To use these perl scripts, it is required that:
1. You have previously installed the eBook Publisher software from http://www.ebooktechnologies.com/sup...r_download.htm . The perl scripts use 'SBPubX' interface calls to create, view and manipulate .opf and .IMP files.
2. Perl scripts can be executed on your computer. For Windows, I had to install ActivePerl from ActiveState from http://www.activestate.com/store/activeperl/ .
BUILDIMP
A simple batch file called buildIMP.bat demonstrates how .IMP ebooks can be created using the workhorse routine 'Html2imp.pl'. The 'Html2imp.pl' perl script takes four input parameters: 'Authorname' 'Title' 'Category' and 'htmlfilename'. If any of the parameters contain spaces, then quotes need to surround that parameter!
EXAMINEOPF
After executing the sample batch file, the .IMP ebook is produced along with the .opf project file used internally. This file can later be loaded into eBook Publisher for further processing, if necessary.
This perl script is invoked by 'examineOPF.pl project.opf'. It displays some information about the .opf file and prints it to stdout. If warranted, this output can be redirected to a log file.
INFOIMP
This perl script is invoked by 'infoIMP.pl ebook.imp'. It displays some information about the .IMP file and prints it to stdout. If warranted, this output can be redirected to a log file.
A variant of this script is 'infoIMPcsv.pl', which will 'dump' the .IMP details to stdout in 'comma separated values' format. You should redirect the output to a file so it can be opened in Microsoft Excel or similar for further exploration. Another variant is 'infoIMPtab.pl', which will 'dump' the .IMP details to stdout in 'tabbed text' format. Again, you should redirect the output to a file so it can be opened in Microsoft Excel or similar for further exploration. Try these on a directory full of .IMP files and you will get a mini-database of .IMP details!
VALIDATEOPF
This perl script is invoked by 'validateOPF.pl project.opf'. It validates the .opf, showing all errors/warnings, and prints them to stdout. Redirect to a log file, if warranted. This can be used to extract the error log of a complex .opf build for future study.
Please feel free to modify these to suit your needs and consider sharing your achievements for others to benefit.
-Nick
EDIT: 18-May-2008 added windows executables (see IMP_OPF_windows-executables.zip) of each perl script for those that can't/won't work with perl scripts directly.
Last edited by nrapallo; 05-28-2008 at 12:25 AM. Reason: added windows executables (see IMP_OPF_windows-executables.zip) of each perl script |
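As a rough illustration of the 'authorname - title'.html idea described above, here is a minimal Python sketch (not one of the attached perl scripts) that splits such a filename and assembles the four parameters in the order the post lists for 'Html2imp.pl'. The function names, the example filename and the 'Fiction' category are made up for illustration; only the parameter order and the need to quote parameters containing spaces come from the post.

```python
import subprocess
from pathlib import Path


def split_author_title(html_path):
    """Split 'Author - Title.html' into (author, title) at the first ' - '."""
    stem = Path(html_path).stem
    author, sep, title = stem.partition(" - ")
    if not sep or not author.strip() or not title.strip():
        raise ValueError(f"expected 'Author - Title' in filename, got {stem!r}")
    return author.strip(), title.strip()


def html2imp_args(html_path, category, script="Html2imp.pl"):
    """Build the argument list: Authorname, Title, Category, htmlfilename."""
    author, title = split_author_title(html_path)
    return ["perl", script, author, title, category, str(html_path)]


if __name__ == "__main__":
    # Hypothetical example file and category.
    args = html2imp_args("Charles Dickens - Oliver Twist.html", "Fiction")
    # list2cmdline quotes the parameters that contain spaces, Windows style.
    print(subprocess.list2cmdline(args))
```

Passing the list to `subprocess.run(args)` instead of printing it would avoid the quoting question entirely, since each parameter is handed over as a separate argument.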
02-07-2008, 12:03 PM | #2 |
GuteBook/Mobi2IMP Creator
|
New 'infoIMPdir.bat' showing how to get 'mini-database' of .IMP details
Just place the 'infoIMP*' files (from the first posting above) in any .IMP directory and execute the 'infoIMPdir.bat' provided below, then open the resulting .csv in Microsoft Excel or similar to further explore your .IMPs!
I know this is 'crude', but the results are worthwhile!
-Nick |
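For anyone who prefers not to use a batch file, the same 'mini-database' idea can be sketched in Python. This is not the attached 'infoIMPdir.bat'; it is a minimal sketch that assumes 'infoIMPcsv.pl' sits in the current directory, takes a single .IMP filename and writes its comma-separated details to stdout (as described in the first post). Whether the script also emits a header row is not known here, so the output is concatenated as-is. The function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def find_imp_files(directory):
    """Return the .imp files in a directory, sorted, ignoring extension case."""
    return sorted(p for p in Path(directory).iterdir()
                  if p.is_file() and p.suffix.lower() == ".imp")


def dump_directory(directory, csv_path, script="infoIMPcsv.pl"):
    """Run the script once per .IMP file and collect its stdout in one .csv."""
    with open(csv_path, "w", encoding="utf-8", newline="") as out:
        for imp in find_imp_files(directory):
            result = subprocess.run(["perl", script, str(imp)],
                                    capture_output=True, text=True, check=True)
            out.write(result.stdout)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python thisfile.py IMP_DIRECTORY OUTPUT.csv
    dump_directory(sys.argv[1], sys.argv[2])
```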
02-12-2008, 12:47 AM | #3 |
GuteBook/Mobi2IMP Creator
|
New 'mobi2imp.pl' that will directly convert from mobipocket .prc to .IMP formats
In the Content Forum under the (sticky) Mobiperl thread started by tompe, you will find post #219 that enables you to directly convert from mobipocket .mobi/.prc to .IMP formats via a perl script based on tompe's 'mobi2html'.
This new perl script is named 'mobi2imp.pl' and is available as a windows executable, 'mobi2imp.exe'.
MOBI2IMP
A simple batch file called mobi2IMP.bat demonstrates how .IMP ebooks can be converted directly from mobipocket .mobi/.prc using the workhorse routine 'mobi2imp'. 'Mobi2imp' takes two mandatory parameters, 'MobiSource' and 'ExplodeDir', and three optional parameters: 'Category' 'Authorname' and 'Title'. If any of the parameters contain spaces, then quotes need to surround that parameter!
Attached below is the 'mobi2imp.pl' code, 'mobi2imp.exe' as well as two sample conversions in the .zip file for anyone who wants to test it out.
To run this manually, just:
Code:
perl mobi2imp.pl --verbose "Oliver Twist.prc" Oliver
Code:
c:\> mobi2imp.exe --verbose "Oliver Twist.prc" Oliver
You must have the eBook Publisher software previously installed as well as the proper perl lib setup**. This will allow those with many mobipocket .mobi/.prc files to migrate them to their ebookwise 1150 easily.
Note: ** using 'mobi2imp.pl' requires a tricky setup as I used, as a base, the 'Mobiperl' package prepared by tompe (see his website http://www.ida.liu.se/~tompe/mobiperl/ for detailed setup instructions). This all started out at post #197 in the Mobiperl thread and has evolved into a functional perl script. While getting all the right libs is daunting, it is very rewarding once it's set up properly. After all this SETUP, it is easy. I promise!
For a MINI-TUTORIAL, check here. For the Mobi2imp Wiki, check here.
Enjoy!
-Nick
EDIT: For a new GUI based Mobi2IMP with many improvements, see Mobi2IMP 9.4 with new Windows GUI & UTF-8
Previous changes...
Code:
version 2 - Now 'Category Author Title' are optional and don't need to be provided (if the mobipocket ebook was 'well' composed).
version 3 - Now more forgiving of poorly constructed anchors (seen in feedbooks.com .prc's) and will insert the '<a name' tag as long as the 'filepos' points to the start of a tag i.e. "<". This will help retain most, if not all, hyperlinks!
version 4 - Things that changed:
- now better warns that eBook Publisher must be installed first.
- now takes switches '--1200' and '--1100' to allow for the simultaneous creation of the REB 1200 and REB 1100 versions along with the EBW 1150 .IMP version.
- conversely, if the switch '--1150' is specified, then the EBW 1150 .IMP version is NOT created.
version 5 - Things that are allowed now:
- now allows you to change the text one font size larger ('medium') and one font size smaller (back to 'x-small') by using '--largerfont' and '--smallerfont' respectively.
- per JSWolf's request, you can now change margins from the default (2%) to '--nomargins' (0%), '--largemargins' (5%) and even '--hugemargins' (8%)
- you can change the default text-align from justify to '--nojustify' (i.e. left aligned).
- further to Kovidgoyal's recent 'mobi2oeb' post, can now produce OEBFF (.oeb) output with '--oeb'. As a result, the output can be any and all at once of: '--1150' .IMP, '--1200' .IMP, '--1100' .rb and '--oeb' OEBFF!
version 6 - Changes:
- per DaleDe's request, you can now change margins from the default (2%) to '--tinymargins' (2px).
- no longer requires external program (nconvert.exe); all image 'fixing' done internally by GD.pm (thanks to tompe for this suggestion)!
version 7 - Changes:
- per DaleDe's suggestion, you can now add small indent with '--indent'.
- per JSWolf's request, you can now eliminate (blank line) paragraph separation with '--nopara' (may also need to indent para with '--indent').
- per DaleDe's suggestion, you can now get more info with '--verbose' or '--debug'.
- first attempt at a 'readme.txt' - you get this also by executing 'mobi2imp' without any parameters.
version 8 - Changes:
- can now override default .IMP naming of 'Author - Title'.ext, by using '--out MYIMPBOOKNAME' to specify .IMP filename produced (omit .ext)
- BUGFIX: now strip <body> tag of any BD/mobi specific in-line styles before start 'fixing' things.
EDIT 21 Feb 2008: version 9 - Changes:
- mobi2imp.exe (version 9) - windows executable (very stable now!)
- can now handle (text) .pdb files properly i.e. ereader 'TEXt'/'REAd' type
- now makes the BookDesigner notice at the end 'small print' by default
- can make that BD notice 'big print' with '--BDbig' (case sensitive)
- can make that BD notice start on a newpage using '--BDnewpage'
- can even remove that BD notice at the end with '--BDremove'
- to add flair, can use '--bgcolor #FF80FF' to set background color for every page
- BUGFIX: Only when using '--nopara' option, some <br />'s get ignored so another <br /> is added; if this creates issues, then '--noBRfix' will not add the second <br />.
TO DO:
- better documentation and even a tutorial would be nice
- ability to add a (default) 'cover' image to every conversion from .mobi to .imp exists, but not yet ready for the consequences
- ability to add running headers (ala GEBLibrarian) exists, but not yet fully implemented
- add more user defined settings along with some 'Mobiperl' fixes like TOC first, cover link, prefix title...
- add Windows GUI ala PDFRead 1.8
Last edited by nrapallo; 10-18-2008 at 08:39 AM. Reason: link to new version 9.4 added |
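The version 3 note about anchors can be pictured with a small Python sketch. It is not the code in 'mobi2imp.pl'; it only illustrates the stated rule: a link target given as a 'filepos' byte offset gets an '<a name' tag inserted only when that offset lands on the start of a tag ("<"). The anchor naming ('filepos' followed by the offset) and the function name are assumptions made for illustration, and rewriting the links themselves into hrefs is left out.

```python
import re

# Matches filepos=0000001234 (optionally quoted) inside the exploded markup.
FILEPOS_RE = re.compile(rb'filepos=["\']?(\d+)')


def insert_anchors(html: bytes) -> bytes:
    """Insert <a name=...> at each filepos target that points at a '<'."""
    targets = {int(m.group(1)) for m in FILEPOS_RE.finditer(html)}
    out = html
    # Work from the highest offset down so the lower offsets stay valid.
    for pos in sorted(targets, reverse=True):
        if pos < len(html) and html[pos:pos + 1] == b"<":
            anchor = b'<a name="filepos%d"></a>' % pos  # hypothetical naming
            out = out[:pos] + anchor + out[pos:]
        # Offsets that land in the middle of text are skipped, as in the note.
    return out
```

Working on bytes rather than decoded text matters here because 'filepos' values are byte offsets, so multi-byte characters would otherwise shift the targets.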
02-18-2008, 01:52 PM | #4 |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
I have added a description in the wiki for this tool. It is very simple so far and needs additional data but it is a start.
https://wiki.mobileread.com/wiki/Mobi2imp
Can the version be added somewhere in the pl file please? I am starting to get confused as to what I have downloaded and what the latest is. (Maybe a --v option also to print this out.) Over time the other perl scripts can be added to the wiki also, but I just want to get something down today.
Dale |
02-18-2008, 02:06 PM | #5 |
GuteBook/Mobi2IMP Creator
|
To do in 'mobi2imp' version 7 (started but not yet ready for release):
- add '--TOC' switch to add that TOC entry to the beginning of the file. abandoned.
- perl script/source code had version number, but now it is printed out. done.
- more documentation/tutorial in the works (thanks for the wiki entry)
This program is a testament to the solid foundation provided by tompe's 'mobi2html'. It made the .IMP specific changes so easy to merge from my original 'html2imp.pl'. I never thought it would take off this much, so fast. As more users use it, I will make any 'necessary' corrections/modifications to aid in the direct conversion of .prc to .imp.
-Nick
Last edited by nrapallo; 02-21-2008 at 06:28 PM. Reason: version 7 now out |
02-18-2008, 02:31 PM | #6 | |
Resident Curmudgeon
Posts: 76,370
Karma: 136466962
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
02-18-2008, 03:53 PM | #7 |
Grand Sorcerer
|
I just built a book using the latest version 6 mobi2imp and it said it built a 1150 but it really built a 1200.
Dale |
02-18-2008, 04:03 PM | #8 | |
Grand Sorcerer
|
Quote:
Dale |
|
02-18-2008, 06:04 PM | #9 | |
Resident Curmudgeon
|
Quote:
|
|
02-18-2008, 07:06 PM | #10 | |
Grand Sorcerer
|
Quote:
Dale |
|
02-19-2008, 12:31 AM | #11 | |
GuteBook/Mobi2IMP Creator
|
Quote:
I've had this happen once or twice before. I re-installed the eBook Publisher software version 2.2.5 and it fixed the issue. I think the libraries might have been made 'unstable' by some other .IMP making programs (GEBLibrarian, BD, Softbook Word macro, my 'mobi2imp' perl script).
-Nick |
|
02-19-2008, 12:33 AM | #12 | |
GuteBook/Mobi2IMP Creator
|
Quote:
-Nick |
|
02-19-2008, 11:05 AM | #13 | |
GuteBook/Mobi2IMP Creator
|
Quote:
Version 7 - Changes:
- mobi2imp.exe (version 7) - windows executable
- per DaleDe's suggestion, you can now add small indent with '--indent'.
- per JSWolf's request, you can now eliminate (blank line) paragraph separation with '--nopara' (this sets '--indent' automatically).
- per DaleDe's suggestion, you can now get more info with '--verbose' or '--debug'.
- first attempt at a 'readme.txt' - you get this also by executing 'mobi2imp' without any parameters.
To follow soon, a tutorial, once I gather enough user feedback.
Enjoy!
-Nick
Last edited by nrapallo; 02-19-2008 at 12:18 PM. |
|
02-20-2008, 09:50 AM | #14 | |
GuteBook/Mobi2IMP Creator
|
Quote:
VERSION 8 - Changes:
- mobi2imp.exe (version 8) - windows executable (very stable now!)
- now allows you to specify the .IMP filename produced, overriding default naming of 'Author - Title'.ext
- BUGFIX: now strip <body> tag of any BD/mobi specific in-line styles before start 'fixing' things.
TO DO:
- better documentation and even a tutorial would be nice
- ability to add a (default) 'cover' image to every conversion from .mobi to .imp exists, but not yet ready for the consequences
- add more user defined settings along with some 'Mobiperl' fixes like TOC first, cover link, prefix title...
-Nick |
|
02-20-2008, 11:49 AM | #15 |
Resident Curmudgeon
|
nrapallo, I need to get version 7 back to test something. Can you please post it again? Thanks!
Version 8 has a bug in it that strips out blank lines that are supposed to be there. And I think version 7 kept them. That's why I want to test version 7. |
|