07-16-2009, 06:49 PM | #1 |
Junior Member
Posts: 6
Karma: 10
Join Date: Jul 2009
Device: Sony PRS505
|
Converting *big* multi-file HTML doc for PRS-505 reader
Hi guys,
I'm a software developer by trade, using various programming languages (mainly C, PHP and C++, but some others as well). I also have a notoriously bad memory, and tend to end up scrabbling through my collection of half a dozen or so quick-reference guides and the documentation for the languages I use. So I figured: I have a PRS505 (Sony e-Reader), so why not put some of them on there?
Well, I've spent a good chunk of the last two days trying to convert the PHP documentation (the 5MB multi-file version with the table of contents) to LRF format so I can read it on the Reader. I started out by using Calibre, dragging and dropping the index.html file onto the main window and converting it to LRF. This left me with a ~6-page LRF containing a great conversion of the TOC, but nothing else.
So I moved on a bit, and tried using the conversion utility (html2lrf) directly. I've done this on Windows XP (32-bit) and Ubuntu 8.10 (64-bit), in both cases with the "reduce memory usage" option turned on and with it off. In all cases, the conversion runs almost to the end ("rationalizing font sizes"), eats ~2GB of RAM, then dies. Linux kills the converter off; Windows lets it eat all the RAM it likes, then the OS freezes solid (but not before some impressive graphical effects, like window borders and buttons disappearing).
Does anyone know of any HTML-to-LRF converters that can handle documents as big as the PHP manuals, or any ways to make Calibre do this without eating so much RAM?
The PHP documentation source file I'm using is freely downloadable from http://uk3.php.net/download-docs.php -- I'm using "English, many files, .tar.gz".
Cheers,
Phil. |
07-16-2009, 06:55 PM | #2 |
creator of calibre
Posts: 44,381
Karma: 23766374
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
convert it to epub
|
07-16-2009, 06:58 PM | #3 |
Junior Member
Posts: 6
Karma: 10
Join Date: Jul 2009
Device: Sony PRS505
|
|
07-16-2009, 07:00 PM | #4 |
creator of calibre
Posts: 44,381
Karma: 23766374
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You couldn't even ssh into it? really?
|
07-16-2009, 07:16 PM | #5 |
Junior Member
Posts: 6
Karma: 10
Join Date: Jul 2009
Device: Sony PRS505
|
I didn't try that, but X was frozen solid (mouse cursor wouldn't move) and numlock/capslock were unresponsive.
From past experience, if hitting numlock doesn't make the keyboard light blink, it's probably time to hit the Big Red Switch... |
07-16-2009, 07:34 PM | #6 |
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
|
I just had a look. Do you realize that it has over 9 _thousand_ html files? No wonder calibre crashed.
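For anyone who wants to check the scale of a manual before feeding it to a converter, here is a minimal Python sketch that counts the HTML files under a directory and totals their size. The function name `summarize_html_tree` and the `phpdoc/html` path are illustrative assumptions; none of this is part of calibre.

```python
from pathlib import Path


def summarize_html_tree(root):
    """Return (file_count, total_bytes) for .html/.htm/.xhtml files under root."""
    suffixes = {".html", ".htm", ".xhtml"}
    count = 0
    total = 0
    # Walk the whole tree and keep only regular files with an HTML-like suffix.
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            count += 1
            total += path.stat().st_size
    return count, total


if __name__ == "__main__":
    # "phpdoc/html" is a hypothetical location for the unpacked manual.
    n, size = summarize_html_tree("phpdoc/html")
    print(f"{n} HTML files, {size / 1e6:.1f} MB in total")
```

A file count in the thousands is a hint that a converter which loads everything at once may struggle.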
|
07-16-2009, 07:40 PM | #7 |
creator of calibre
Posts: 44,381
Karma: 23766374
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Oh I've often rescued machines that don't respond to any input device by sshing into them. calibre is designed to keep everything in memory while it converts, so your machine may be running out of memory.
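If the worry is a converter eating all the RAM and taking the machine down with it, one option on Linux is to cap the process's memory so it fails on its own instead. The following is only a sketch under that assumption: `run_with_memory_cap` is a made-up helper name, it relies on Python's standard `resource` and `subprocess` modules, and it has nothing to do with calibre's own options.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(cmd, max_bytes):
    """Run cmd with its address space capped at max_bytes (Linux only).

    Returns the child's exit code. A child that tries to go past the cap
    gets an allocation failure instead of pushing the machine into swap.
    """
    def apply_limit():
        # Runs in the child just before exec; RLIMIT_AS caps virtual memory.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limit).returncode
```

Here `cmd` is whatever converter command line you normally run, given as a list of strings, and `max_bytes` could be something like `1024 ** 3` for a 1 GiB cap. A non-zero return value then means the conversion failed, possibly because it hit the cap.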
|
07-16-2009, 07:45 PM | #8 | ||
Junior Member
Posts: 6
Karma: 10
Join Date: Jul 2009
Device: Sony PRS505
|
Quote:
Quote:
Though that said, the WXP version got to 2GB then died quite horribly, so maybe something similar is happening here. I'm not going to shout "bug!" because IMHO that's a bit like shouting "fire!" in a crowded theatre |
||
07-16-2009, 07:51 PM | #9 |
creator of calibre
Posts: 44,381
Karma: 23766374
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The whole conversion framework has been redesigned in calibre 0.6; try that (preferably in Linux) and see if it works. This time keep an eye on htop.
|
07-17-2009, 01:47 PM | #10 | |
Junior Member
Posts: 6
Karma: 10
Join Date: Jul 2009
Device: Sony PRS505
|
Quote:
If so, the link on http://pypi.python.org/pypi/calibre/ (and in various places on the wiki) seems to be broken. My attempts to check out the source from Bzr got me this message: Code:
philpem@cheetah:~/calibre$ bzr branch http://bzr.kovidgoyal.net/code/calibre/trunk calibre
bzr: ERROR: Connection error: Couldn't resolve host 'bzr.kovidgoyal.net' (-2, 'Name or service not known') |
|
07-17-2009, 01:49 PM | #11 |
creator of calibre
Posts: 44,381
Karma: 23766374
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
bzr co lp:calibre
but there are links to precompiled beta releases of calibre in a sticky in this forum |
07-17-2009, 05:00 PM | #12 |
Junior Member
Posts: 6
Karma: 10
Join Date: Jul 2009
Device: Sony PRS505
|
OK, seen the sticky. I'll have a play with that in a bit.
html2epub didn't like it though: Code:
Splitting getting-started.xhtml (2 KB)
Splitting on page breaks...
Looking for large trees...
No large trees found
Splitting indexes.xhtml (648 KB)
Splitting on page breaks...
Looking for large trees...
Traceback (most recent call last):
  File "/usr/bin/html2epub", line 8, in <module>
    load_entry_point('calibre==0.5.14', 'console_scripts', 'html2epub')()
  File "build/bdist.linux-x86_64/egg/calibre/ebooks/epub/from_html.py", line 543, in main
  File "build/bdist.linux-x86_64/egg/calibre/ebooks/epub/from_html.py", line 480, in convert
  File "build/bdist.linux-x86_64/egg/calibre/ebooks/epub/split.py", line 500, in split
  File "build/bdist.linux-x86_64/egg/calibre/ebooks/epub/split.py", line 76, in __init__
  File "build/bdist.linux-x86_64/egg/calibre/ebooks/epub/split.py", line 166, in split_to_size
  File "build/bdist.linux-x86_64/egg/calibre/ebooks/epub/split.py", line 166, in split_to_size
  File "build/bdist.linux-x86_64/egg/calibre/ebooks/epub/split.py", line 166, in split_to_size
  File "build/bdist.linux-x86_64/egg/calibre/ebooks/epub/split.py", line 152, in split_to_size
calibre.ebooks.epub.split.SplitError: Could not find reasonable point at which to split: indexes.xhtml Sub-tree size: 647 KB
I'm going to have a go with 0.6 as soon as the binary finishes downloading (assuming I can make a couple of i686 binaries work properly on x86_64).
EDIT: Nope, it's not playing ball. Code:
philpem@cheetah:~/phpdoc/html$ ~/calibre/prebuild/ebook-convert index.html ../phpdoc.lrf
Traceback (most recent call last):
  File "/tmp/init.py", line 45, in <module>
  File "/home/kovid/work/calibre/src/calibre/ebooks/conversion/cli.py", line 214, in main
  File "/home/kovid/work/calibre/src/calibre/ebooks/conversion/cli.py", line 203, in create_option_parser
  File "/home/kovid/work/calibre/src/calibre/ebooks/conversion/plumber.py", line 9, in <module>
  File "/home/kovid/work/calibre/src/calibre/customize/ui.py", line 11, in <module>
  File "/home/kovid/work/calibre/src/calibre/customize/builtins.py", line 318, in <module>
  File "/home/kovid/work/calibre/src/calibre/ebooks/epub/input.py", line 9, in <module>
  File "ExtensionLoader_lxml_etree.py", line 12, in <module>
ImportError: /home/philpem/calibre/prebuild/libexslt.so.0: symbol gcry_cipher_setkey, version GCRYPT_1.2 not defined in file libgcrypt.so.11 with link time reference
Last edited by philpem; 07-17-2009 at 06:55 PM. |
|