03-25-2008, 12:41 AM | #31 |
Resident Curmudgeon
Posts: 76,491
Karma: 136564766
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
The other Mobipocket-to-HTML converter I have leaves characters such as curly quotes and em dashes as the actual characters rather than as numeric HTML entities. Can't mobi2html do the same thing?
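For anyone who wants actual characters in the HTML a converter has already produced, here is a minimal Python sketch of that post-processing step. It is not part of mobi2html or of any tool mentioned in this thread; the function name is made up for illustration. It only touches numeric references above the ASCII range, so references such as &#60; that protect the markup are left alone.

```python
import re

# Matches decimal (&#8212;) and hexadecimal (&#x2014;) character references.
NUMERIC_ENTITY = re.compile(r"&#([xX][0-9a-fA-F]+|[0-9]+);")


def numeric_entities_to_chars(markup):
    """Replace non-ASCII numeric character references with the characters themselves."""

    def replace(match):
        ref = match.group(1)
        code = int(ref[1:], 16) if ref[0] in "xX" else int(ref)
        # Leave ASCII references (e.g. &#60; for "<") and out-of-range values untouched.
        if 127 < code <= 0x10FFFF:
            return chr(code)
        return match.group(0)

    return NUMERIC_ENTITY.sub(replace, markup)


if __name__ == "__main__":
    print(numeric_entities_to_chars("&#8220;Well&#8212;yes,&#8221; she said."))
```

The result has to be saved in an encoding that can hold those characters (UTF-8, for example) and the HTML header has to declare that encoding, otherwise readers may show the wrong glyphs.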
|
03-25-2008, 01:28 AM | #32 | |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Quote:
It's crude, but it should yield the same results as your Java code on the iLiad. No filepos links are fixed, nor <mbp:pagebreak> tags, etc., but I did strip out the images (without file extensions).
I used other batch files to convert all the .prc files to .rb (Rocket eBook / REB 1100 format). The leftover .html was used to generate .imp files by hand, one at a time. It was tedious, so I only converted the .prc files I was going to read instead of everything in sight. Sigh, Mobi2IMP had not yet been born!!!
The hack was based on 'makedoc9', but I called it 'makedocN' (N as in Nick!). The attached .zip includes everything you need (I hope) to run makedocN. It was compiled with Cygwin and requires its .dll to execute. Just unzip it, place your .prc files in the directory, double-click 'doprc.bat' and wait for it to finish.
What do you think, could I give tompe a run for his money? Or should I just stick with what I am good at, i.e. .imp!
Last edited by nrapallo; 03-25-2008 at 02:04 AM. Reason: fixed typos |
|
03-25-2008, 06:05 AM | #33 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Book Designer uses "<DIV>" for everything, rather than "<P>". All the Mobi books I've posted in the last few months are the result of saving HTML from BD, doing some minor edits (e.g. replacing "<HR>" with "<mbp:pagebreak/>"), and then using Mobi Creator to create the PRC.
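The <HR> edit mentioned above can be scripted. This is only a sketch in Python with a hypothetical function name; it assumes the rules appear as plain <HR> or <hr /> tags, with or without attributes, and it is not how Book Designer or Mobi Creator work internally.

```python
import re

# Any <hr> tag, upper or lower case, with optional attributes or a trailing slash.
HR_TAG = re.compile(r"<hr\b[^>]*>", re.IGNORECASE)


def hr_to_pagebreak(markup):
    """Swap every horizontal rule for a Mobipocket page-break tag."""
    return HR_TAG.sub("<mbp:pagebreak/>", markup)


if __name__ == "__main__":
    print(hr_to_pagebreak("<DIV>Chapter 1</DIV><HR><DIV>Chapter 2</DIV>"))
```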
|
03-25-2008, 10:11 AM | #34 | |
Addict
Posts: 314
Karma: 1002965
Join Date: Mar 2006
Location: UK
Device: ILiad. Gen 3, PocketBook 360, Kobo Aura HD, Kindle Oasis 2
|
Quote:
I thought your 'makedocN' converter was excellent. It preserved the layout, retained all the numeric (&#) punctuation entities, and retained all the original tags. It is very fast and extremely easy to use.
I would still remove the following code inserted by Mobipocket Creator: <div height="0em"></div> <div height="0em"></div> but because the closing </p> is preserved, this would be an easy find-and-replace task. I would change the headings to my preferred simple and clean <h4>Chapter Number</h4> instead of <h4 align="center"><font size="+1"><b>Chapter Number</b></font></h4>. I don't care that it doesn't batch convert.
I then tried it on Harry's Lorna Doone Vol 1.prc. I wanted the code to have some white space so I could read it easily, so I ran the file through Tidy.exe. Tidy reported: "1777 warnings, 52 errors were found! Not all warnings/errors were shown. This document has errors that must be fixed before using HTML Tidy to generate a tidied up version." So Tidy could not work on the file until the errors had been fixed.
I changed the HTML header to my own and removed every <mbp:pagebreak/>, then put it through Tidy again. This time Tidy reported: "1454 warnings, 0 errors were found!" and the tidied code was easier for me to read. Tidy.exe had corrected all out-of-date upper-case tags to lower case, changed <b> to <strong>, changed every &nbsp; to &#160;, corrected <i> tags to <em> and <br/> tags to <br />, and removed all <font> tags. All en and em dashes were shown as ? because the original file did not contain the correct HTML code for them.
But most disconcerting of all, every paragraph started and ended with <div> and </div> respectively. I hate this because it makes cleaning up the HTML a horrendous task, what with all the other <div this> and <div that> tags. I use Textpipe frequently to clean up bad HTML code, and I doubt that even this fine programme could easily sort out all these divs.
So, if the original .prc file contains good clean code, it is a quick and easy task to clean up the resulting HTML. But if it contains bad, outdated code, then it ain't so easy. I understand that the code generated by, say, BookDesigner is adequate for creating good-looking ebooks, but a peek under the hood reveals out-of-date, bloated code. My aim is to future-proof my HTML files with good clean code that will render faster, reduce size and convert to any format for most reading devices.
Thank you and well done, nrapallo, for your makedocN converter. I shall be using it frequently. |
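The find-and-replace clean-up described in this post could be sketched in Python roughly as follows. This is an illustration only, not what Textpipe or Tidy do; the function names are invented, and the patterns assume the markup looks exactly like the samples quoted in the post (empty zero-height divs, h4 headings wrapped in font and b tags, Mobipocket page-break tags).

```python
import re

# Empty spacer divs that Mobipocket Creator inserts, plus trailing whitespace.
EMPTY_DIV = re.compile(r'<div height="0em">\s*</div>\s*', re.IGNORECASE)
# A whole h4 heading, whatever attributes it carries.
HEADING = re.compile(r"<h4\b[^>]*>(.*?)</h4>", re.IGNORECASE | re.DOTALL)
# <font ...>, </font>, <b>, </b> (but not <br> or <body>).
INLINE_FORMATTING = re.compile(r"</?(?:font|b)\b[^>]*>", re.IGNORECASE)
# Mobipocket page-break tags, with or without the self-closing slash.
PAGEBREAK = re.compile(r"<mbp:pagebreak\s*/?>", re.IGNORECASE)


def clean_heading(match):
    """Rebuild an h4 heading without attributes or font/bold wrappers."""
    text = INLINE_FORMATTING.sub("", match.group(1)).strip()
    return "<h4>" + text + "</h4>"


def clean_mobi_html(markup):
    """Strip spacer divs and page breaks, then simplify the headings."""
    markup = EMPTY_DIV.sub("", markup)
    markup = PAGEBREAK.sub("", markup)
    return HEADING.sub(clean_heading, markup)


if __name__ == "__main__":
    sample = ('<div height="0em"></div> <div height="0em"></div>'
              '<h4 align="center"><font size="+1"><b>Chapter Number</b></font></h4>')
    print(clean_mobi_html(sample))
```

Regular expressions are fine for fixed snippets like these, but they will not untangle the nested <div> problem described above; that needs a real HTML parser or a lot of manual work.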
|
03-25-2008, 10:55 AM | #35 | ||
GuteBook/Mobi2IMP Creator
|
Quote:
Quote:
Never in my wildest dreams did I think someone else would find it useful; I only posted it here as an alternative way to view the unaltered .html inside a .prc. You have made my day! |
||
03-26-2008, 01:12 PM | #36 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
It might be possible but it seems much more robust to use entities. |
|
03-26-2008, 01:43 PM | #37 | |
GuteBook/Mobi2IMP Creator
|
Quote:
Entities, whether numeric or named, are more useful, albeit harder to read.
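To illustrate the entity approach favoured here: if a file already contains the actual characters, Python's standard "xmlcharrefreplace" error handler turns everything outside ASCII into decimal numeric entities. This is a sketch with a hypothetical function name, not code from mobi2html or makedocN.

```python
def chars_to_numeric_entities(text):
    """Return text as pure ASCII, with other characters as &#NNNN; references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(chars_to_numeric_entities("\u201cWell\u2014yes,\u201d she said."))
```

The output is plain ASCII, so it survives whatever character encoding the reading device assumes, which is the robustness argument made above.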
|
05-28-2008, 12:31 AM | #38 |
GuteBook/Mobi2IMP Creator
|
Using Windows executables (of Perl scripts) to produce .IMP ebooks and more...
I have added Windows executables (see IMP_OPF_windows-executables.zip) to post #1 above, one for each Perl script in posts #1 and #2 of this thread, for those who can't or won't work with Perl scripts directly.
Enjoy! Last edited by nrapallo; 05-28-2008 at 12:40 AM. |
01-03-2009, 05:37 PM | #39 |
GuteBook/Mobi2IMP Creator
|
Word2IMP.pl
I finally figured out (through trial and error) how to properly use BuildFromWordDoc via the Builder interface of the PubX.dll OLE library, as explained here. Even though you can convert .doc directly into .imp (even in batches) using the BulkConvert program that also "ships" with the eBook Publisher software, this Perl script (Word2imp.pl) now allows pre- and post-processing changes to be made using Perl scripts!
WORD2IMP A simple batch file called Word2IMP.bat demonstrates how .IMP ebooks can be created using the workhorse routine 'Word2imp.pl'. The 'Word2imp.pl' Perl script takes as input a single filename of the .rtf, .doc or .html to convert to .imp. If the filename contains spaces, the parameter must be surrounded by quotes! EDIT: See post #41 below for a revised Word2imp.pl Perl script that produces a better .doc to .imp conversion (set CSS=1 or, better still, CSS=2).
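The quoting rule is easy to get wrong when the command line is assembled by another script. Below is a small Python sketch of it; the helper names are invented, and the "perl Word2imp.pl <file>" form is an assumption based only on the statement that the script takes a single filename (the real Word2IMP.bat may call it differently).

```python
def quote_if_needed(filename):
    """Wrap the filename in double quotes when it contains spaces."""
    already_quoted = filename.startswith('"') and filename.endswith('"')
    if " " in filename and not already_quoted:
        return '"' + filename + '"'
    return filename


def word2imp_command(filename):
    """Build the command line; the invocation form is assumed for illustration."""
    return "perl Word2imp.pl " + quote_if_needed(filename)


if __name__ == "__main__":
    print(word2imp_command("My Book.doc"))
    print(word2imp_command("book.doc"))
```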
After executing the sample batch file, the .IMP ebook is produced along with the .opf project file used internally. However, since temporary files are used to build the .imp, the source .html file created gets deleted at the end and is no longer available. A work-around is to create a .oeb ebook and then 'unpack' it to see that intermediary .html file! Then and only then can the ebook be loaded into eBook Publisher for further processing, if necessary. Last edited by nrapallo; 04-23-2009 at 02:01 PM. Reason: see post #41 below for a revised Word2imp.pl Perl script |
01-10-2009, 05:18 PM | #40 |
Wizard
Posts: 3,671
Karma: 12205348
Join Date: Mar 2008
Device: Galaxy S, Nook w/CM7
|
Nick, many thanks for the Perl script. You did a fantastic job. With a little elbow grease I was able to port your Perl script to VBA and integrate it with the BookCreator tool. The IMP files created from BookCreator are excellent! Many thanks!
Also, one recommendation: set CSS=1. Even though the documentation says the CSS feature is obsolete and not used, I found this to be incorrect. CSS=1 is required to preserve the MS Word formatting. =X= |
01-10-2009, 05:49 PM | #41 | |
GuteBook/Mobi2IMP Creator
|
Quote:
I just tried it and the .doc to .imp sample conversions look absolutely marvellous! I wasn't too happy with my sample conversions when I had
Code:
$project->{CSS} = 0;
but they look much better with
Code:
$project->{CSS} = 1;
EDIT: I was able to unpack the .oeb version and display, with the Preview button, different .imp and _1200.imp files showing two columns with nice margins! They are not in the .zip files above, but are listed below. Looks nice indeed! (Thanks =X=)
The attached source HTML from the .oeb version did not create the same nice margins. I am trying to figure out why the Preview shows the nice margins but the Build Edition... doesn't. I think it has to do with the "oeb-column" settings.
Last edited by nrapallo; 01-10-2009 at 09:43 PM. Reason: different .imp and _1200.imp showing two columns with nice margins |
|
01-10-2009, 10:34 PM | #42 |
GuteBook/Mobi2IMP Creator
|
Using <div style="oeb-column-number:auto">
I'm testing style="oeb-column-number:auto", and it appears that overly large margin-left and margin-right values were the culprit in preventing the nice margins displayed in the previous post's .imp test ebooks.
I now attach a .html/.opf to create a better-looking TWO-COLUMN ebook! All you need is the <div style="oeb-column-number:auto"> HTML style. I don't know exactly what it does, but it appears to allow text to be split over two columns. This may have some practical uses once it's better understood! See the thread entitled "Easily create two column (newspaper-style) ebooks" for an example of how to use this style in ebooks!
Oh, by the way, the attached .zip contains the 'images' that were converted by word2imp.pl to .wmf format and referenced in the .html. The only problem is that .wmf is NOT supported by eBook Publisher, so I used the generated fallback .jpg files and copied them over the .wmf pictures (I had to rename the .jpg files to .wmf!).
Last edited by nrapallo; 03-09-2009 at 01:16 AM. Reason: now better understood |
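As a rough illustration of the two-column wrapper described in this post, here is a Python sketch that puts existing body markup inside the <div style="oeb-column-number:auto"> element. The function name is hypothetical, and the sketch sets no margin-left or margin-right at all, since the post only reports that values which were too large spoiled the layout and does not say which values work.

```python
def wrap_in_auto_columns(body_html):
    """Wrap body markup in the column div quoted in the post above."""
    return '<div style="oeb-column-number:auto">\n' + body_html + "\n</div>"


if __name__ == "__main__":
    print(wrap_in_auto_columns("<p>First paragraph.</p>\n<p>Second paragraph.</p>"))
```

Whether the text actually splits into two columns depends on the target device and on eBook Publisher, so the result still needs checking with the Preview button.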
02-12-2009, 05:24 PM | #43 |
GuteBook/Mobi2IMP Creator
|
PPT2imp.pl
Even though you can convert MS PowerPoint .ppt directly into .imp (even in batches) using the BulkConvert program that also "ships" with the eBook Publisher software, this Perl script (PPT2imp.pl) now allows pre- and post-processing changes to be made using Perl scripts!
PPT2IMP A simple batch file called PPT2IMP.bat demonstrates how .IMP ebooks can be created using the workhorse routine 'PPT2imp.pl'. The 'PPT2imp.pl' Perl script takes as input a single filename of the PowerPoint .ppt to convert to .imp, with either one 'slide' to a page (rotated) or two 'slides' to a page (not rotated). If the filename contains spaces, the parameter must be surrounded by quotes!
After executing the sample batch file, the .IMP ebook is produced using temporary files, but the source .html file created gets deleted at the end and is no longer available. A work-around is to create a .oeb ebook and then 'unpack' it to see that intermediary .html file! However, this doesn't work with the latest eBook Publisher v2.3.8 (which understands .epubs), as this part seems broken: no image files are stored in the resulting .epub. |
04-22-2009, 01:05 AM | #44 | |
GuteBook/Mobi2IMP Creator
|
Quote:
|
|
|