05-11-2008, 04:38 PM | #1 |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2008
Device: Sony PRS-505
|
Book Processor - Anything to LRF and HTML converter
Hi there
I see there are quite a few tools out there to convert all kinds of different files to LRF, but here I come with yet another one. It's called Book Processor. It takes a source file as input and can output LRF and HTML. The source file can be created by hand or generated from the original input file (as long as you have a program capable of reading that file).
You can find the application, the documentation and an example book at http://stuf.ro/bp/
The project is at a pretty early stage, so do expect bugs.
Radu |
05-11-2008, 05:12 PM | #2 |
creator of calibre
Posts: 44,409
Karma: 23977332
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Welcome to MobileRead. May I ask which features missing from existing converters you are trying to supply?
|
05-11-2008, 05:12 PM | #3 |
Wizard
Posts: 2,624
Karma: 1008294
Join Date: Dec 2007
Location: Iowa, USA
Device: Nook Simple Touch
|
ok will give it a try, thanks
|
05-11-2008, 06:04 PM | #4 | |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2008
Device: Sony PRS-505
|
Thank you
Quote:
In some languages, dialogue is written as a line beginning with a dash, followed by the spoken text. Since the Reader always justifies text, the space between the dash and the first word ends up with a variable width. You can of course replace that space with a nonbreaking space by hand in every document written in such a language, but I'm trying to automate tasks like this. The idea behind the project is to automate as much as possible and to always obtain a consistent result, regardless of how the input file looks.
To that end, I also implemented some more advanced features that automate book organization without any intervention in the input text. For example, you can separate chapters by using a regexp, or you can transform all chapter names with a single instruction (check out the example book, in which all chapter names are uppercase). This is something you would normally do manually. There's also the organization of chapters into "parts", something that is present in most large books but that I couldn't find in any of the LRF converters I tried (you can always use a hack, such as an empty chapter acting as a part separator, but I'm not very comfortable with that).
There are some more features that I wanted but couldn't find, such as automatic OCR fixing and footnotes. Check out the manual for the complete feature list. Some features already exist in other implementations and some, at least to my knowledge, have never been implemented before. |
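To make the two ideas above concrete, here is a minimal Python sketch of replacing the space after a dialogue dash with a nonbreaking space, and of splitting chapters with a regexp while transforming their titles. It is only an illustration and not Book Processor's actual code or source-file syntax: the function names `fix_dialog_dashes` and `split_chapters`, the set of dash characters, and the `Chapter N` heading pattern are all assumptions.

```python
import re

NBSP = "\u00a0"  # nonbreaking space, which justification should not stretch

# Assumed set of dialogue dashes: hyphen-minus, en dash (U+2013), em dash (U+2014).
DIALOG_DASH = re.compile(r"^([ \t]*[-\u2013\u2014])[ \t]+(?=\S)", re.MULTILINE)


def fix_dialog_dashes(text):
    """Replace the spacing after a line-initial dash with one nonbreaking space."""
    return DIALOG_DASH.sub(lambda m: m.group(1) + NBSP, text)


def split_chapters(text, heading_pattern=r"^Chapter \d+.*$", transform=str.upper):
    """Split text into (title, body) pairs at lines matching heading_pattern.

    The heading pattern is a hypothetical default; transform is applied to
    every chapter title (uppercase here, as in the example book).
    """
    heading = re.compile(heading_pattern, re.MULTILINE)
    matches = list(heading.finditer(text))
    chapters = []
    for i, match in enumerate(matches):
        # A chapter body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        title = transform(match.group(0).strip())
        body = text[match.end():end].strip()
        chapters.append((title, body))
    return chapters
```

Only dashes at the start of a line are touched, so a dash in the middle of a sentence keeps its ordinary space.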
|
05-11-2008, 06:25 PM | #5 |
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
|
hello Little Dragon, some of the features you mention sound quite interesting to me, are you planning to support output in any other format besides lrf and html in future (like .prc / mobi) ?
|
05-11-2008, 06:29 PM | #6 |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2008
Device: Sony PRS-505
|
I can only test LRF so far, since I only have a PRS-505... I also plan to support other languages besides English and Romanian, but I have to get up to date with their typographical conventions.
|
05-11-2008, 06:41 PM | #7 |
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
|
thanks for the answer ! just so you know, you can use an emulator to test other formats, if you like (i know they are available for mobi .prc and for .imp, and probably for all others as well). i will definitely keep an eye on your project ; the format i use is .imp, which is based on html but is a "dead-end" format (you can't convert it to anything else), but i think when i make Project Gutenberg books to upload here it would be nice if i could also make a .prc version, since it can be read by a lot more people and easily converted if necessary. and so far i have not been very satisfied with the programs i have used for preparing my texts.
|
05-11-2008, 07:10 PM | #8 | |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2008
Device: Sony PRS-505
|
Quote:
One of the goals of the project, though, is consistency, and I'll definitely check out existing libraries and emulators for other formats. I think I could initially approximate the difference in font size from the difference in resolution and physical screen size... In any case, I still have quite a bit of tweaking left to do... |
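One way to read that approximation is to scale font sizes by the ratio of pixel densities. The Python sketch below only illustrates that idea and is not something the project is known to implement: the helper names are hypothetical, and the screen figures in the usage lines are illustrative assumptions rather than measured values.

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density from the resolution and the physical screen diagonal."""
    return math.hypot(width_px, height_px) / diagonal_in


def scale_font_size(size_px, source_ppi, target_ppi):
    """Scale a pixel font size so its physical height stays about the same."""
    return size_px * target_ppi / source_ppi


# Illustrative figures only: a 600x800 screen with a 6 inch diagonal as the
# reference device, and a hypothetical 320x480, 3.5 inch target device.
reference_ppi = pixels_per_inch(600, 800, 6.0)
target_ppi = pixels_per_inch(320, 480, 3.5)
print(scale_font_size(20, reference_ppi, target_ppi))
```

This ignores differences in font rendering and viewing distance, so it is only a starting point for tweaking.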
|
05-12-2008, 05:06 AM | #9 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
You can download the Windows MobiPocket Reader from http://www.mobipocket.com to test out MobiPocket books.
|
05-12-2008, 05:44 AM | #10 | |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2008
Device: Sony PRS-505
|
Quote:
|
|
05-12-2008, 03:05 PM | #11 |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2008
Device: Sony PRS-505
|
New version: 0.1.1
Changelog:
- rewrote the quote detection algorithm to take into account possible OCR errors
- added stderr warnings where unbalanced quotes are detected
- adjusted the OCR fixing heuristics |
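As a rough idea of what a stderr warning for unbalanced quotes can look like, here is a minimal Python sketch. It is not the algorithm used in 0.1.1, which is not described here: the function name `find_unbalanced_quotes`, the per-paragraph check, and the choice of curly quote characters are all assumptions, and it makes no attempt to account for OCR errors.

```python
import sys


def find_unbalanced_quotes(paragraphs, open_quote="\u201c", close_quote="\u201d"):
    """Return 1-based indexes of paragraphs whose quotes do not balance.

    A paragraph counts as unbalanced if a closing quote appears before any
    opening quote, or if an opening quote is never closed. A warning is
    written to stderr for each such paragraph.
    """
    unbalanced = []
    for index, paragraph in enumerate(paragraphs, start=1):
        depth = 0
        for char in paragraph:
            if char == open_quote:
                depth += 1
            elif char == close_quote:
                depth -= 1
                if depth < 0:
                    break  # closing quote with nothing open
        if depth != 0:
            print(f"warning: unbalanced quotes in paragraph {index}", file=sys.stderr)
            unbalanced.append(index)
    return unbalanced
```

Checking each paragraph separately keeps one missing quote from flagging the rest of the book, although it would also flag a quotation that legitimately spans several paragraphs.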
Tags |
html, lrf, prs500, prs505, reader |
|