11-17-2012, 11:32 PM | #1 |
Zealot
Posts: 110
Karma: 972092
Join Date: Jan 2012
Device: iPhone
|
Making Table of Contents on free eBook files from Archive.org
The books that I read are free .ePub/.mobi files downloaded from Google Books and Archive.org. But they download with a terribly constructed table of contents, with most chapters missing.
What would be the best way of putting bookmarks into these files? I once used Sigil to build the table of contents, and from what I remember it was a bit inefficient: there were too many point-and-clicks. Is anybody familiar with this inefficiency in Sigil's table of contents tool, and can you suggest a faster, more efficient way? |
11-18-2012, 12:17 AM | #2 |
350 Hoarder
Posts: 3,574
Karma: 8281267
Join Date: Dec 2010
Location: Midwest USA
Device: Sony PRS-350, Kobo Glo & Glo HD, PW2
|
I find Sigil's TOC to be very nice and I always use it when cleaning up horrible looking ebooks. The only requirement is that the chapters need to have a header tag assigned, such as <h1>, and not just be a regular <p> with a font style, and badly formatted books usually are not consistent in those headers. It's the only way Sigil (or any program) has to know what is a chapter and what isn't. Then when you click on "Generate Table of Contents" it will pick up any and all chapters with any <h> tag. Sometimes they're nested, or sometimes a blank line will have picked up an <h> tag; just uncheck the ones you don't want to show in the TOC. Then select Tools, TOC, Create HTML TOC and you're done.
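To make the "it picks up anything with an <h> tag" point concrete, here is a rough Python sketch of the idea. It is not Sigil's actual code, just an illustration using the standard library; the names HeadingCollector, collect_headings and usable_entries are made up for this example. It lists every heading it finds and then drops the blank ones, which is what you would otherwise untick by hand.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every <h1>..<h6> element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading we are inside, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # Only h1..h6 count; this deliberately skips <hr>, <head>, etc.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            # Collapse runs of whitespace so "  Chapter   1 " becomes "Chapter 1".
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def collect_headings(xhtml):
    """Return every heading in the markup, including empty ones."""
    parser = HeadingCollector()
    parser.feed(xhtml)
    parser.close()
    return parser.headings


def usable_entries(headings):
    """Drop blank headings (the stray ones you would uncheck in the TOC dialog)."""
    return [(level, text) for level, text in headings if text]


sample = """
<h1>Chapter 1</h1>
<p>Chapter 2</p>
<h1> </h1>
<h2>Part One</h2>
"""

print(usable_entries(collect_headings(sample)))
```

Note that "Chapter 2" in the sample is missed because it is only a <p>, which is exactly the inconsistency described above.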
Sometimes the original book is done so badly, though, that you might need to do some editing beyond just checking for <h> tags. I've found that some chapters will have a title in them, making a longer chapter name, while others in the same book do not. You'll have to manually correct each chapter so they read the same, whether you want to include the longer chapter names or keep it simply as "Chapter xx". That's not a shortcoming of Sigil, it's just a badly formatted book where you need to correct another person's laziness. Last edited by Ripplinger; 11-18-2012 at 05:23 AM. |
11-18-2012, 08:00 AM | #3 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
Calibre guesses where the chapters are and Sigil does not, so Sigil is not as time saving in that respect. But calibre can guess wrong. As Ripplinger noted, the chapters need an h tag; you can search for "chapter", mark each hit with an h tag, and be done with that part in 5-10 minutes. Actually, you can stay in Book View and just scroll down the page, highlight each chapter title, and press the h tag buttons. While you're at it, please press Control+Enter at the end of each chapter so the book will work on my Sony Reader and not be terribly slow on everyone else's.
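If the clicking gets tedious, the search-and-mark step can be roughed out with a regular expression. This is only a sketch under assumptions: it assumes the chapter titles sit alone in their own <p> and start with the word "Chapter" followed by a number or Roman numeral, and the names CHAPTER_P and promote_chapters are invented for this example. The same kind of pattern could be tried in a find-and-replace on the code view, but check the result by eye either way.

```python
import re

# A <p> (with any attributes) whose whole content is "Chapter <number>",
# optionally followed by a short title. The 60-character cap is an arbitrary
# guard against ordinary paragraphs that happen to start with "Chapter 1 ...".
CHAPTER_P = re.compile(
    r"<p\b[^>]*>\s*(chapter\s+[0-9ivxlc]+\b[^<]{0,60}?)\s*</p>",
    re.IGNORECASE,
)


def promote_chapters(xhtml, level=1):
    """Rewrite matching <p> chapter lines as <hN> headings."""
    return CHAPTER_P.sub(
        lambda m: f"<h{level}>{m.group(1)}</h{level}>",
        xhtml,
    )


sample = (
    '<p class="calibre1">Chapter 1</p>\n'
    "<p>It was a dark night.</p>\n"
    "<p>CHAPTER II. The Storm</p>"
)

print(promote_chapters(sample))
```

Any attributes on the original <p> are dropped, and OCR-mangled titles such as "Chaptcr" will not match, so this does not replace a scroll through the book.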
BUT that is just the beginning with the OCRed material at Archive.org. It is necessary to go through the whole book and check for what will be hundreds of errors and even whole sections out of place, especially if the book is laid out in columns and has illustrations. R will be E and e will be c, etc. Footnotes that run over more than one page also give headaches in nonfiction books. By the time you have done your job well in converting them, you won't want to read the book for 6 months because you will have been over it so many times. Even if someone has done a lot of hard work, such as at Hyperwar, you will still spend a long time picking up after their work. Some have an extreme affinity for ordered lists and turn abc into 123 and 12345678 into 12341234 because they happened to hit a page break. |