05-26-2007, 03:53 PM | #61 |
Blueberry!
Posts: 888
Karma: 133343
Join Date: Mar 2007
Device: Sony PRS-500 (RIP); PRS-600 (Good Riddance); PRS-505; PRS-650; PRS-350
|
Pielrf - Text to LRF Tool
HarryT suggested I post a little info about my new utility pielrf for converting e-books in text form to LRF format. I designed pielrf with Gutenberg and OCR books in mind.
Pielrf duplicates the look of Sony Connect eBooks while combining ease of use with full control over options. It includes much-requested features: an easy table of contents, chapterization, top-of-page headers, and curly quotes. Most books take only a few minutes of preparation. You can download pielrf from the following thread. Here are a few basics.

RUNNING THE PROGRAM

pielrf -i infile.txt -o outfile.lrf -t "Book Title" -a "Author Name"

This will generate a basic LRF file with a single "Table of Contents" entry, flowed paragraphs, curly quotes, and page headers.

EDITING THE INPUT FILE

If you want a Table of Contents and Chapters, it takes just ONE WORD(!!) in your book's text file!

<chapter>

If your book is like this...

Chapter One
...Some Text
...
Chapter Two
...Some More Text

...change it to this:

<chapter>Chapter One
...Some Text
...
<chapter>Chapter Two
...Some More Text

That's the only editing needed; the rest is done automatically. One additional tag lets you add text and vertical spacing on the TOC page.

<toctext>

Putting it on a line by itself adds a blank line to the TOC. Following it with text, as with "<chapter>", adds that text to the TOC page.

HTML FOR TYPOGRAPHY

Unlike BookDesigner, pielrf requires you to edit the input text file separately and add typographic HTML tags by hand before running it. Where BookDesigner, for example, lets you highlight text and select "Italics," here you would add the HTML tags "<i>" and "</i>" yourself. The file does not have to be actual HTML (although it can be); you can add these tags to any plain text file.
Recognized Tags

<CENTER></CENTER> - Centered Text
<I></I> - Italics
<B></B> - Bold
<SUB></SUB> - Subscript
<SUP></SUP> - Superscript
<BR> - Line Break (Vertical Whitespace)
<P> - Paragraph (Use With "-b html" Command Line Switch)
<H1></H1> <H2></H2> <H3></H3> - Heading (Bold+Large Font) Tags (all resolve to same font size)

You can combine tags like "<center>" and "<h1>", for example, with the only limit being bold and italics -- it's one or the other, so you won't get bold-italic text.

FEATURES

+ Table of Contents Menu and Page via the <chapter> tag.
+ Top-of-Page headers.
+ Curly (typographic) quotes.
+ Paragraph auto-flow.
+ Can make whole book Bold to increase contrast.
+ Understands HTML tags <i></i>, <b></b>, <center></center>, <sub></sub>, <sup></sup>, <p></p>, <h1></h1>-<h3></h3>.
+ Understands ALL HTML Ampersand tags - &amp;, &pound;, &uuml;, etc.
+ Paragraphs can be delimited by tabs, spaces, vertical whitespace.
+ Font size / weight (bold) can be controlled from command line.
+ Ability to control almost everything else from the command line too!

OPTION OVERLOAD

You can control just about every option regarding layout. The defaults should work just fine, and all you need to provide are the input and output files, and a title and author. But to see the options just type:

pielrf -h

I provide a couple of examples in the next post.

-Pie

Last edited by EatingPie; 05-26-2007 at 09:29 PM. Reason: Sheesh, I'm typo king. |
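[Editor's sketch] The "<chapter>" edit described in this post is mechanical enough to script for books whose headings follow a regular pattern. The snippet below is a minimal sketch, not part of pielrf: the function name add_chapter_tags and the 60-character cutoff are hypothetical, and it assumes every chapter heading sits on its own short line beginning with the word "Chapter". Books with other heading styles would still need hand editing.

```python
import re

# Assumption: a heading is a short line that starts with "Chapter" plus a
# number or word. This pattern is illustrative, not something pielrf defines.
CHAPTER_LINE = re.compile(r"chapter\s+\S+", re.IGNORECASE)


def add_chapter_tags(text, max_len=60):
    """Prefix each heading line with <chapter>; leave all other lines alone."""
    out = []
    for line in text.split("\n"):
        stripped = line.strip()
        # Long lines are treated as body text that merely starts with "Chapter".
        # Lines already tagged start with "<", so they never match again.
        if len(stripped) <= max_len and CHAPTER_LINE.match(stripped):
            line = "<chapter>" + stripped
        out.append(line)
    return "\n".join(out)


sample = "Chapter One\nSome text\n\nChapter Two\nSome more text\n"
print(add_chapter_tags(sample))
```

Running the function twice on the same text leaves it unchanged, so it is safe to re-run after fixing headings by hand.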
05-26-2007, 03:56 PM | #62 |
Blueberry!
Posts: 888
Karma: 133343
Join Date: Mar 2007
Device: Sony PRS-500 (RIP); PRS-600 (Good Riddance); PRS-505; PRS-650; PRS-350
|
Here are two examples from Gutenberg: War of the Worlds and Utopia. Utopia took me about 5 minutes to prepare, moving the license and adding "<chapter>" tags. War of the Worlds took only slightly longer, but I also used the "<toctext>" tag.

pielrf -i utopia.txt -o utopia.lrf -t Utopia -a "Thomas More"

pielrf -i waroftheworlds.htm -o waroftheworlds.lrf -t "War of the Worlds" -a "H.G. Wells" -b html --headerstyle=titlechapter

I've included both the original (edited) file and the final LRF.

-Pie |
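[Editor's sketch] When converting several books, the two command lines in this post can be generated from a small table. The snippet below is a sketch under stated assumptions: pielrf_command and the books list are hypothetical names, and only flags that appear in this thread (-i, -o, -t, -a, -b html, --headerstyle=titlechapter) are used. It prints the commands rather than running them.

```python
import shlex


def pielrf_command(infile, outfile, title, author, extra=()):
    """Build the argument list for one pielrf run (flags taken from this thread)."""
    return ["pielrf", "-i", infile, "-o", outfile, "-t", title, "-a", author, *extra]


# The two books from this post; the tuple layout is just for illustration.
books = [
    ("utopia.txt", "utopia.lrf", "Utopia", "Thomas More", ()),
    ("waroftheworlds.htm", "waroftheworlds.lrf", "War of the Worlds", "H.G. Wells",
     ("-b", "html", "--headerstyle=titlechapter")),
]

for book in books:
    argv = pielrf_command(*book)
    # shlex.join quotes arguments containing spaces, giving a pasteable command.
    print(shlex.join(argv))
    # To convert for real, import subprocess and call
    # subprocess.run(argv, check=True); this assumes pielrf is on your PATH.
```

Passing the arguments as a list avoids shell quoting problems with titles and author names that contain spaces.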
05-26-2007, 03:59 PM | #63 |
Lovin' the e-book life...
Posts: 633
Karma: 2509
Join Date: Nov 2006
Location: Colorado
Device: Ebookwise 1150, Sony PRS-505, Amazon Kindle, BeBook (with OpenInkpot)
|
Can this be used to place images in the .lrf file as well?
|
05-26-2007, 04:45 PM | #64 |
creator of calibre
Posts: 44,565
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
No, for that you have to use html2lrf, which supports a much larger subset of HTML, including images, tables, and links...
|
05-26-2007, 09:35 PM | #65 | |
Blueberry!
Posts: 888
Karma: 133343
Join Date: Mar 2007
Device: Sony PRS-500 (RIP); PRS-600 (Good Riddance); PRS-505; PRS-650; PRS-350
|
Quote:
I notice that HarryT likes to put images on the book title page. Is this the typical use? Or did you want to put images interspersed throughout the book -- with, say, text flowing around (above/below) the image? Say, like an illustrated chapter book. -Pie |
|
05-26-2007, 09:46 PM | #66 |
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
|
While graphics make for a better cover, many of the books I have worked on are all text and graphics are of no importance.
That said, many of the current books I am working on for the near future are graphic intensive. Some have just ornate drop caps at the start of chapters while others are a tight combination of text and graphics. One, A Child's Garden of Verses, has many graphics that need to fit together tightly to produce the finished pages. (Even html2lrf is having trouble with this at the moment even when autorotate is turned off.) Others like the Beatrix Potter series are simply alternating text and graphics without any side-by-side of text and graphics. So yes, the integration of graphics is a big deal for any tool that is used to produce LRF books. |
05-27-2007, 03:04 AM | #67 | |
Blueberry!
Posts: 888
Karma: 133343
Join Date: Mar 2007
Device: Sony PRS-500 (RIP); PRS-600 (Good Riddance); PRS-505; PRS-650; PRS-350
|
Quote:
The reason I asked about flow was because my abortive attempt only produced images at the end of the chapter (or page) -- not very flow-friendly. That didn't seem worthwhile, so I punted. I'd love to give it another shot, but I'd like to see what users are looking for so I can take that into account ahead of time. That said, my main focus has been ease of use in producing "text" books, so images may be at cross purposes for pielrf. -Pie |
|
05-27-2007, 03:35 AM | #68 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Being able to incorporate images into the book is very important for some types of book - eg, if you look at the "Philo Vance" detective stories I've posted recently, you'll see that they have all sorts of diagrams in them such as plans of the building where the crime took place, and the books would lose a lot without them.
I don't personally see the need to "flow" text around images as being important - simply being able to put images on the page "between" two paragraphs would suit me fine. BD does support text flow - you can put an image on the left side of the page and have the text carry on to the right of it - but in cases where I've found that in the books I've converted, I've "unflowed" it - i.e., cut the picture from the text and re-inserted it at the end of the paragraph. On the Reader's small screen, I don't think that "flowed" text looks especially good. Just being able to have:

<paragraph of text>
<picture>
<Another paragraph of text>

is, however, I think pretty vital! In reality, because of the size of the Reader's screen, large-ish images do generally end up either at the start of a page, or at the end, or, indeed, occupying a page by themselves, but I certainly think there should be the option of having images in the middle of the page. When I did the "Sherlock Holmes" stories, for example, it was vital that the "Dancing Men" graphics be able to be inserted mid-text. |
05-27-2007, 12:59 PM | #69 |
Evangelist
Posts: 482
Karma: 7696
Join Date: May 2007
Location: Turner, Oregon
Device: Sony Reader
|
I'm also trying to find a way to define and edit images to the right size for the page... hmm...
Last edited by Roy White; 05-27-2007 at 01:03 PM. |
05-27-2007, 01:10 PM | #70 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
With BD, once you've dragged the image onto the page, just double click it. A dialog pops up on which you can set the size of the image.
|
05-28-2007, 01:42 PM | #71 |
Evangelist
Posts: 482
Karma: 7696
Join Date: May 2007
Location: Turner, Oregon
Device: Sony Reader
|
OK, Harry, here's a trick I accidentally found the other day after I used the technique you describe in your tutorial to change the chapter headings from "Chapter 1", "Chapter 2", etc. to the real chapter headings. You should change your tutorial to this way, since it saves hours.

What I do is load the text, clean the Gutenberg legalese junk from the top and bottom, add the name of the book and author and the contents, then use the element browser under Tools to find the titles. (Usually they will be "Chapter One", "Chapter Two", etc., with the real chapter titles assigned as subtitles under each one.) Sometimes a chapter or two will not be assigned as a title, so I click on the chapter before the missing one and scroll to find the chapter that isn't assigned as a title. Then I highlight it and make it a title using the corrector panel, and it will appear in the element browser.

After I make sure all the chapters are there, I go through the chapter list one by one. Simply clicking on "Chapter One" in the element browser list makes the text jump to that point in the book, and I can check that the real chapter title appears below each "Chapter" title. Sometimes the real chapter title is there but not assigned as a subtitle, so I highlight THAT and use the corrector panel to change it to a subtitle. This way I'm sure the real chapter titles will be there when I use the element browser to find the subtitles: the REAL chapter titles, the text just below "Chapter 5" etc., are assigned as subtitles.

So I click through the subtitles and make sure that every one has the words "Chapter 8" or whatever above it. Sometimes the element browser will assign other text as subtitles, so I highlight any subtitle text that I don't want to end up as a real chapter title and use the "verse" tool on it. (That deselects those words as subtitles and makes them a verse; use whatever you like to deselect that text as a subtitle.)

Once you have all the real chapter titles assigned as subtitles, and if the text doesn't already have the chapter titles listed in a table of contents, you can open a Notepad file and type in all the chapter titles by having the element browser find the subtitles. (Remember, by now they are the real chapter titles.) Then copy and paste the Notepad list into the contents page at the top of your book.

Next, use the element browser to find the titles; they will be "Chapter One", "Chapter 2", etc. Highlight the top one in the list, hold Shift, and select the bottom one; that highlights the entire list. Then use the pull-down list at the bottom of the element browser tool where it says "reformat selected elements as", choose delete, and hit the button. All those "Chapter 7" etc. titles are gone! Then use the element browser to find the subtitles (your real chapter titles), highlight them all, and use the reformatting tool at the bottom of the element browser window to change all THOSE to titles. Voila! You have all the real titles as your chapter titles and have rid yourself of the "Chapter 4" text in about 2 minutes flat, instead of scrolling through the text and doing it all manually. I saved an hour doing this on Les Miserables after the program crashed the other day and I lost an hour of tediously doing this manually without saving my work.

You can then use the element browser to find all empty paragraphs. It won't find real paragraph breaks, just empty ones where someone hit Return twice; sometimes HTML or TXT files are formatted with more than one line break between paragraphs. Find the empty paragraphs, select them all by clicking the top one and the bottom one with Shift held down, and use the pull-down reformatting tool to delete them. All the text in the book will be nice and tight and tidy. Sweet trick. I hope that was as clear as mud.
Last edited by Roy White; 05-28-2007 at 01:54 PM. |
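[Editor's sketch] For anyone preparing plain text files rather than working in Book Designer, the empty-paragraph cleanup described above has a rough text-file analogue. The snippet below is a minimal sketch with a hypothetical function name, collapse_blank_lines. It assumes a single blank line between paragraphs should be kept (it can act as a paragraph delimiter), so it only collapses runs of two or more blank lines rather than deleting every one.

```python
import re


def collapse_blank_lines(text):
    """Reduce each run of two or more blank lines to a single blank line."""
    # A "blank" line may still contain spaces or tabs, so allow those
    # between the newlines. A lone blank line does not match and is kept.
    return re.sub(r"\n(?:[ \t]*\n){2,}", "\n\n", text)


messy = "First paragraph.\n\n\n\nSecond paragraph.\n \n\t\nThird paragraph."
print(collapse_blank_lines(messy))
```

Text that already has at most one blank line between paragraphs passes through unchanged.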
05-28-2007, 05:20 PM | #72 | |
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
|
Quote:
Since I took the raw text from their site it is only fair that they get the credit they earned. It is also part of the terms of the agreement we agreed to when we downloaded from their site. I am sure that this is what you mean that you do because I know that you are not the sort of person to cut out their credit altogether. |
|
05-28-2007, 09:09 PM | #73 |
Evangelist
Posts: 482
Karma: 7696
Join Date: May 2007
Location: Turner, Oregon
Device: Sony Reader
|
Oops. I never read it. I guess that's what it says, huh? From now on I'll include it.
DDDOOOHHH!! |
05-29-2007, 03:53 AM | #74 | |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Quote:
1. Remove all mention of PG from the book.
or:
2. Include the full licence agreement.
I personally choose option 1. I freely acknowledge that PG are a wonderful resource (and I do a fair amount of proofreading for them myself), but I just don't like the 20 pages of legal "stuff" in the books that I create. |
|
05-29-2007, 04:00 AM | #75 | |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Quote:
What you're saying, if I understand you correctly, is that you prefer to remove "Chapter 1" and replace it with "Harry Writes a Book" (or whatever the "real" chapter title, as you put it, is). That might work for "pulp" fiction that nobody really cares about, but I certainly wouldn't do it for a "classic". Suppose I read a book in which I see "In chapter 53 of Les Miserables, Hugo says....". How do I find out what chapter 53 is in your version of the book which lacks chapter numbers? We obviously all prepare books the way that we think best, but personally I feel that preserving chapter and book number for "serious" literature is absolutely essential. What do other people think? |
|
|