11-02-2018, 02:18 PM   #1
Rob557
Zealot
Posts: 108
Karma: 810
Join Date: Jul 2012
Device: Kobo
New ToC filter for "duplicates" (problem case) - oops

The Aug 24, 2018 calibre release 3.30 has the following new feature:

"ToC Editor: When generating ToCs using headings/XPath ignore duplicate entries at the same level that have the same text."

This (new) feature is currently non-optional. One circumstance where it causes a problem is with ePubs that have multiple parts where each part restarts the chapter numbering. If the first part has five chapters (#1-5) and the second has seven (#1-7), then the first five chapters in the second part are excluded when trying to build the ToC.

A ToC generator that focuses only on the chapter numbers will display chapters 1-7 with no indication that there are two parts and that five of the twelve chapters are not displayed.
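
To make the two-part case concrete, here is a minimal Python sketch. It is not calibre's actual code; the function name `drop_same_text_duplicates` and the heading data are made up for illustration, and it assumes all chapter headings are matched at a single level.

```python
# Hypothetical heading data: Part One has chapters 1-5, Part Two has chapters 1-7.
part_one = [f"Chapter {n}" for n in range(1, 6)]
part_two = [f"Chapter {n}" for n in range(1, 8)]
headings = part_one + part_two  # 12 chapter headings, all at the same level


def drop_same_text_duplicates(entries):
    """Keep only the first entry for each distinct text (illustration only)."""
    seen = set()
    kept = []
    for text in entries:
        if text in seen:
            continue  # same text already present at this level, so it is skipped
        seen.add(text)
        kept.append(text)
    return kept


toc = drop_same_text_duplicates(headings)
print(len(headings), "headings ->", len(toc), "ToC entries")
print(toc)
```

With this data the result holds "Chapter 1" through "Chapter 7" only: the first five come from Part One, the last two from Part Two, and nothing in the list shows that a second part exists.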
________________________________________________
ERROR - the above comments refer to the new feature as "non-optional", but as theducks points out in a later comment below, there is an easy option to turn off the feature.

Last edited by Rob557; 11-05-2018 at 06:01 AM. Reason: belatedly inserting "ERROR" comment at bottom
11-02-2018, 03:40 PM   #2
theducks
Well trained by Cats
Posts: 30,346
Karma: 58032210
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Oooo! This is common in Omnibus editions.
I expect that some user smarts would be needed (to turn off this option).
(Just another example of why I avoid BULK conversions. It only takes one "I forgot about that".)
11-02-2018, 04:24 PM   #3
Rob557
I'm not sure the ToC glitch arises as a direct result of bulk conversions (?).

When the ToC for a single ePub (new or converted) is being created (or regenerated to make corrections) using various "edit TOC" functions, there will be a problem if the ePub has multiple parts and each part re-starts the numbering for the chapters.

As an example, if a book has five parts, and the parts have from three to eight numbered chapters each, then the total number of chapters in the ToC will be eight, because every other chapter repeats a number already seen and will be treated as a duplicate.

Multi-part Omnibus editions would be a problem only where the chapter headers are non-descriptive (e.g. Chapter 1), and therefore would appear duplicative from one part to the next.
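
A rough Python sketch of the five-part example, under stated assumptions: the chapter counts per part are invented (chosen to range from three to eight), the "Part N, Chapter N" labels are just a stand-in for descriptive headers, and the de-duplication is a simple stand-in for the filter, not calibre's implementation.

```python
def unique_in_order(entries):
    # dict keys preserve insertion order, so the first occurrence of each text is kept
    return list(dict.fromkeys(entries))


# Hypothetical chapter counts for a five-part book (between three and eight each).
chapters_per_part = [3, 5, 8, 4, 6]

# Non-descriptive headers: every part restarts at "Chapter 1".
plain = [
    f"Chapter {n}"
    for count in chapters_per_part
    for n in range(1, count + 1)
]

# Descriptive headers: the text differs from part to part.
descriptive = [
    f"Part {part}, Chapter {n}"
    for part, count in enumerate(chapters_per_part, start=1)
    for n in range(1, count + 1)
]

print(len(plain), "->", len(unique_in_order(plain)))
print(len(descriptive), "->", len(unique_in_order(descriptive)))
```

The plain headers collapse to as many entries as the longest part has chapters, while the descriptive ones all survive because no two share the same text.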
11-02-2018, 05:31 PM   #4
theducks
Quote:
Originally Posted by Rob557
I'm not sure the ToC glitch arises as a direct result of bulk conversions (?).
It is not the Bulk conversion that causes it.
It is the USER who tosses a book to conversion (singly or in bulk) without considering WHAT needs to be done (and what absolutely doesn't).
For the most part, I avoid same-format (e.g. Epub to Epub) conversions. I would rather hand edit (I have a lot of code snippets I use) and KNOW what got changed.
11-02-2018, 06:48 PM   #5
BetterRed
null operator (he/him)
 
Posts: 20,903
Karma: 27620686
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by theducks
For the most part, I avoid same format (eg Epub to Epub) conversions. I would rather hand edit (I have a lot of code snippets I use) and KNOW what got changed.
↑ ↑ ↑ ✔️

BR
11-03-2018, 02:42 AM   #6
kovidgoyal
creator of calibre
Posts: 44,303
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
This feature refers to auto-generating ToCs from contents. This kind of auto-generation can never be made bulletproof. If the heuristics calibre uses don't work for a particular book, you can always create the ToC by hand.

If a book already has a ToC, this is not relevant to it.
11-03-2018, 12:19 PM   #7
Rob557
Quote:
Originally Posted by kovidgoyal
This feature refers to auto-generating tocs from contents. This kind of auto generation can never be made bulletproof.
Hi Kovid. The flexibility you had already built into the ToC generation function is really good. Kudos.

There are a number of circumstances where it is necessary to regenerate a ToC. I can see where duplicates from the ToC generation process could at times be frustrating for less experienced users (more so using XPath, whereas I don't recall auto-generation from headers producing duplicates).

The change introduced Aug 24 will be quite helpful for users less familiar with the ToC generation techniques, but if there is no way for a user to switch off that filter, then the change can produce unexpected errors and block efforts to produce a correct ToC.

I'll post a list of books from which examples can be selected. The question then becomes HOW the ToC generation process could regenerate the ToC for those books. The new filter seems to block any such effort.
11-03-2018, 12:23 PM   #8
Rob557
Quote:
Originally Posted by Rob557
I'll post a list of some books from which examples can be selected, where the question becomes HOW would the ToC generation process be able to re-generate the ToC for those books. The new filter seems to block any efforts.
Thomas Mann - The Decline of a Family
Arthur Hailey - Airport
Helen Hollick - Shadow of the King
John Grisham - Rogue Lawyer
Joseph Conrad - Victory
Leo Tolstoy - Anna Karenina
Leon Uris - Mila 18
Norman Mailer - The Naked and the Dead
Stephen King - The Dead Zone

How then would the duplicate-filtered ToC generation process be able to regenerate the ToC for any of the above books (assuming the ePub version you have access to has the same multi-part chapter renumbering as appears in my versions)?
11-03-2018, 12:52 PM   #9
Rob557
PS. If the new duplicate-filter could be switched off by a user then it is true that the user may subsequently have to deal with the generation of duplicates within a ToC, but it is generally very easy to simply highlight the duplicates and delete them.
11-03-2018, 05:35 PM   #10
Brett Merkey
Not Quite Dead
Posts: 194
Karma: 654170
Join Date: Jul 2015
Device: Paperwhite 4; Galaxy Tab
Quote:
ToC Editor: When generating ToCs using headings/XPath ignore duplicate entries at the same level that have the same text.
Is it true that you cannot turn off this behavior in the newer versions of Calibre? I hope not.

I routinely regenerate ToCs using XPath. Science and history books especially have a chapter structure in parts with numbered chapters. It seems unnecessarily tedious to have to hunt and peck to find and recreate missing chapters, given how powerful and convenient the Calibre XPath facility is in my older copy of Calibre...
11-03-2018, 08:01 PM   #11
theducks
Quote:
Originally Posted by Brett Merkey
Is it true that you cannot turn off this behavior in the newer versions of Calibre? I hope not.
The tick box is still there on 3.33.1
11-03-2018, 10:48 PM   #12
Rob557
oops

Quote:
Originally Posted by theducks
The tick box is still there on 3.33.1
Oops. My mistake. Thank you to theducks for pointing out that when the ToC duplicate-filtering was introduced (Aug 24), a tick box was also provided at the bottom of the screen (shown after selecting the XPath option), labeled "do not add duplicate entries at the same level", so the filter can be turned on or off. Silly me.
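
For anyone who wants to picture what that tick box changes, here is a small Python sketch. It is only an illustration: the function `generate_entries`, its `skip_duplicates` parameter, and the sample book are all hypothetical, and it assumes "same level" means the same nesting depth anywhere in the book, which matches the behavior described in this thread but is not taken from calibre's source.

```python
def generate_entries(headings, skip_duplicates=True):
    """Build ToC entries from (level, text) tuples given in document order.

    skip_duplicates stands in for the tick box: when True, an entry is
    dropped if the same text was already added at the same level.
    """
    seen = set()
    result = []
    for level, text in headings:
        key = (level, text)
        if skip_duplicates and key in seen:
            continue
        seen.add(key)
        result.append((level, text))
    return result


# Hypothetical two-part book with restarted chapter numbering.
book = [
    (1, "Part One"), (2, "Chapter 1"), (2, "Chapter 2"),
    (1, "Part Two"), (2, "Chapter 1"), (2, "Chapter 2"), (2, "Chapter 3"),
]

filtered = generate_entries(book, skip_duplicates=True)
unfiltered = generate_entries(book, skip_duplicates=False)
print(len(filtered), "entries with the filter on")
print(len(unfiltered), "entries with the filter off")
```

With the filter off, every heading is kept, so any genuine duplicates that show up can still be highlighted and deleted by hand afterwards.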

In fact I think I ticked that box when the feature was introduced, and then forgot about it when I later encountered the problem case described above. I did an internet search when I encountered the problem and only saw a comment from 'Ben L' in a different forum asking for the feature to be optional ... without realizing (recalling) that the optional feature was in fact built in by Kovid prior to releasing the new feature.