04-13-2011, 07:07 AM | #1 |
Chapter Detection/Table of Contents Tutorial
Getting Calibre to detect chapters and build a Table of Contents (TOC) sometimes requires a fairly simple examination of your book's HTML source code. This is needed when your source ebook format lacks a TOC and you want to be able to navigate chapters on your reader - e.g. the TOC viewer on epub readers, or the Kindle's 5-way controller and inline TOC.
Calibre has default settings for detecting a Table of Contents. They will autodetect the TOC for some books, but for many they just won't work. You'll need to get your hands dirty, look at the HTML, and edit the defaults.

Before you get started/Common Mistakes
If your ebook already has a well defined TOC then Calibre can convert it without a problem. A common mistake is enabling 'force use of auto-generated table of contents' under the conversion options for Table of Contents. This causes Calibre to throw out the existing Table of Contents, so you would have to laboriously follow this tutorial for every book. Only enable that option if you know your book has a bad Table of Contents and you intend to follow this tutorial to have Calibre make a better one.

Kindle/Mobi Specific Issues
The Kindle has two types of TOCs. One is an NCX file similar to an epub TOC; it is used to create the tick marks on the Kindle's progress bar. The other is a human readable TOC that is linked to from the Table of Contents button/menu. Read here for further details on Kindle TOC support. Calibre will create both types of TOCs during conversion if properly configured. Read this post for a solution to avoid two user visible TOCs or to re-use a book's existing user visible TOC.

Step One, Research your Book
Under the conversion options, go to Search and Replace. Click one of the magic wands on the right half of the screen. If you have multiple source formats Calibre will ask you to choose one - be sure to choose the correct one. Your book's HTML code will pop up in a new window. Start scanning through the HTML for your chapter headings. You can generally find one quite easily, but if you're having trouble, try searching for the plain text that you see when viewing the chapter heading in an ebook reader or web browser.
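If scanning by eye is tedious, the search for a heading's plain text can be roughed out in a few lines of Python. This is only a sketch and not part of Calibre: the function name find_heading_candidates and the sample markup are made up for illustration, and it assumes the HTML is well-formed XML without namespaces, which real ebook files often are not.

```python
import xml.etree.ElementTree as ET


def find_heading_candidates(xhtml, needle):
    """Return the tag path of every element whose own leading text contains needle.

    Hypothetical helper: only elem.text is checked (the text before the first
    child), so text that follows a child tag is ignored.
    """
    root = ET.fromstring(xhtml)
    hits = []

    def walk(elem, path):
        # Label each step as tag.class so the class name is visible in the path.
        label = elem.tag
        if elem.get("class"):
            label += "." + elem.get("class")
        path = path + [label]
        if needle.lower() in (elem.text or "").lower():
            hits.append("/".join(path))
        for child in elem:
            walk(child, path)

    walk(root, [])
    return hits


# Made-up sample modelled on a heading that is just a styled paragraph.
sample = """<body>
<p class="MsoNormal"><span>Chapter 2</span></p>
<p class="MsoNormal"><span>The incredulous look must have been plain on my face.</span></p>
</body>"""

candidates = find_heading_candidates(sample, "Chapter 2")
print(candidates)
```

The printed path shows which tags and classes wrap the heading text, which is the information you need before deciding how to detect it.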
There are two basic situations you'll run into at this point - the book has clearly defined chapter headings, or it doesn't. There are different ways of handling each case.

Well defined chapter headings:
A well defined chapter heading will typically have code that looks something like this:
Code:
<div class="chapter"></div><div> <h3><a name="ch05" id="ch05">5</a> <br /><br/><br /></h3> </div> <p class="fl1">My nagging got the better o
The same heading with calibreN class attributes on its tags:
Code:
<h3 class="calibre6"> <a name="ch05" class="calibre9" id="ch05">5</a> <br class="calibre3"/><br class="calibre3"/><br class="calibre3"/> </h3>
There is a box in the Structure Detection panel of the conversion options where you can configure an xpath to detect chapters. The default is this:
Code:
//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\s+', 'i')) or @class = 'chapter']
That default only matches h1 and h2 tags whose text contains 'chapter', 'book', 'section' or 'part' followed by whitespace, plus anything with class 'chapter'. The heading above is an h3 that contains only a number, so the default misses it. We can just change the xpath to look at h3 tags containing digits:
Code:
//*[((name()='h1' or name()='h3') and re:test(., '\d+', 'i')) or @class = 'chapter']
To match everything in an h3 tag:
Code:
//*[((name()='h1' or name()='h3') and re:test(., '.*', 'i')) or @class = 'chapter']
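To see why the default expression misses the sample heading and the edited ones catch it, here is a rough emulation of the three expressions in plain Python. It is a sketch only: Calibre evaluates the real xpath itself, the detect function is a hypothetical stand-in, and the sample is the heading from above wrapped in a body tag so that it parses as XML.

```python
import re
import xml.etree.ElementTree as ET

# The sample heading from the tutorial, wrapped in <body> to make it well-formed.
SAMPLE = """<body>
<div class="chapter"></div><div> <h3><a name="ch05" id="ch05">5</a> <br /><br/><br /></h3> </div>
</body>"""


def detect(xhtml, tags, pattern):
    """Emulate //*[((name()=... or name()=...) and re:test(., pattern, 'i')) or @class = 'chapter'].

    Hypothetical helper: returns the tag names of matching elements in document order.
    """
    root = ET.fromstring(xhtml)
    matched = []
    for elem in root.iter():
        # "." in the xpath is the element's string value: all nested text joined together.
        text = "".join(elem.itertext())
        by_tag = elem.tag in tags and re.search(pattern, text, re.IGNORECASE) is not None
        by_class = elem.get("class") == "chapter"
        if by_tag or by_class:
            matched.append(elem.tag)
    return matched


default_hits = detect(SAMPLE, {"h1", "h2"}, r"chapter|book|section|part\s+")
digits_hits = detect(SAMPLE, {"h1", "h3"}, r"\d+")
any_hits = detect(SAMPLE, {"h1", "h3"}, r".*")

print(default_hits)  # only the div with class 'chapter'
print(digits_hits)   # the div plus the h3 containing a number
print(any_hits)      # the div plus every h3
```

In this emulation the empty div with class 'chapter' matches all three expressions because of the final "or @class = 'chapter'" clause, so it is worth checking whether a book's class="chapter" elements actually contain the heading.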
Poorly defined chapter headings:
Here's an example of a poorly defined chapter heading:
Code:
<p class="MsoNormal" align="center" style="mso-margin-top-alt:auto;mso-margin-bottom-alt: auto;text-align:center;line-height:normal"> <span style="font-size:14.0pt; font-family:"Times New Roman";mso-fareast-font-family:"Times New Roman"; color:black"></span> </p>
<p class="MsoNormal" align="center" style="mso-margin-top-alt:auto;mso-margin-bottom-alt: auto;text-align:center;line-height:normal"> <span style="font-size:14.0pt; font-family:"Times New Roman";mso-fareast-font-family:"Times New Roman"; color:black">Chapter 2</span> </p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto; line-height:normal"> <span style="font-size:14.0pt;font-family:"Times New Roman"; mso-fareast-font-family:"Times New Roman";color:black">The incredulous look must have been plain on my face. As she realized how her offer sounded, her
Here the heading is just a centered paragraph, with nothing in the markup to tell it apart from body text. Often the simplest solution for this type of chapter heading is to go into the Heuristic Processing panel of Calibre's conversion options and enable Heuristics. Heuristics will search for common types of chapter headings and wrap them with <h2> tags. Now you can go into Structure Detection, click the magic wand next to the chapter detection xpath, and just type '//h:h2' into the first box. Calibre should create a table of contents for this type of scenario.

Image Only Chapter Headings
Some books use only images for chapter headings. This can be difficult to handle and may require hand editing in Sigil by converting to epub first. Basically an image heading might look like this:
Code:
<p class="sb-chapter-image"> <span class="chapter-image"> <img alt="Alice_01.tif" class="generated-style-2" src="images/Alice_01_fmt.jpeg"/> </span> </p>
After hand editing, with the paragraph changed to an h2 that carries a title attribute, it might look like this:
Code:
<h2 class="sb-chapter-image" title="Chapter 1"> <span class="chapter-image"> <img alt="Alice_01.tif" class="generated-style-2" src="images/Alice_01_fmt.jpeg"/> </span> </h2>

Nothing worked, I'm getting Desperate
If none of the above solutions works for you, convert to epub and edit your book in Sigil. During the conversion it's a good idea to go into Calibre's conversion settings and temporarily change the 'Split files larger than' option under 'Epub Output' to 3000 or larger, depending on how large your book is (change this back when you're done). Using Sigil you can mark your chapter headings manually, or possibly with Sigil's search and replace. Once you've finished, use Calibre to convert your new epub to your desired destination format - Calibre will preserve the TOC that was created by Sigil when it converts to the new format.
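If a book has many image-only headings, marking them one at a time gets tedious. The sketch below shows the same kind of edit done in bulk with Python instead of Sigil. It is an illustration under assumptions, not a Calibre or Sigil feature: mark_image_headings is a hypothetical helper, it assumes every heading paragraph uses the class sb-chapter-image from the example above, that chapters are numbered in document order, and that the file is well-formed XML without namespaces.

```python
import xml.etree.ElementTree as ET


def mark_image_headings(xhtml, css_class="sb-chapter-image"):
    """Turn each <p class=css_class> into an <h2> with a title like 'Chapter N'.

    Hypothetical helper: the sequential numbering is an assumption about the book.
    """
    root = ET.fromstring(xhtml)
    count = 0
    # Take a snapshot of the paragraphs first, since their tags are changed in the loop.
    for elem in list(root.iter("p")):
        if elem.get("class") == css_class:
            count += 1
            elem.tag = "h2"
            elem.set("title", f"Chapter {count}")
    return ET.tostring(root, encoding="unicode")


# Sample built from the image heading shown earlier, plus a placeholder paragraph.
sample = """<body>
<p class="sb-chapter-image"> <span class="chapter-image"> <img alt="Alice_01.tif" class="generated-style-2" src="images/Alice_01_fmt.jpeg"/> </span> </p>
<p>Body text.</p>
</body>"""

result = mark_image_headings(sample)
print(result)
```

Ordinary paragraphs are left alone, so the marked headings can then be picked up with an xpath such as '//h:h2' when the book is converted again.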
Last edited by pdurrant; 01-27-2017 at 05:41 PM. Reason: updated links |