05-21-2011, 12:29 PM | #1 |
Calibre Plugins Developer
Posts: 4,692
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
[GUI Plugin] Count Pages
This plugin will determine the number of pages and/or words in a book and store the result in custom column(s). In addition to general library browsing usage, Kindle users can generate APNX files using the value from a pages custom column. So when you send an ebook to your Kindle device from calibre, you will have page numbering available, similar to what you get when loading Amazon books which offer this feature.
You have two main methods of determining page count with this plugin.
The first approach is estimation based on the book content, provided you have an ePub format or a format that is convertible to ePub. If your book has multiple formats, the one used is chosen based on your Preferred Input Format order, which you set in Preferences -> Behavior. Note that if you use this option it can only approximate a paperback edition, due to differences in fonts, images, layouts etc. By default it uses an "accurate" algorithm similar to that created by user_none for generating APNX files for Kindle users. Alternatively, in the configuration you can choose to use the page count used by the calibre e-book viewer, or you can use the Adobe algorithm used by their ADE software and some devices like a Nook. However, if the format being counted is a PDF, there is now a special optimisation to read the actual page count rather than estimating it using any of the above algorithms.
The second page count option is to download the page count from a web page on the Goodreads.com website for your specific linked edition. This can be used for a book with any formats (or even none). How is a goodreads identifier linked? Either by using the Goodreads metadata download plugin, the Goodreads Sync plugin, or by manually typing a goodreads:xxx id into your identifiers field for the edition of interest. If the edition you have linked to has no page count, you can switch editions using a feature added to the Goodreads Sync plugin.
Word count is optionally calculated independently of page count. As this is unavailable on a website, it is subject to the same limitations as estimating page count above, in that you must have either an ePub or a format convertible to ePub available for it to work.
Finally, a variety of readability statistics have been added which you can optionally calculate, such as Flesch-Kincaid and Gunning Fog index.
Main Features:
Special Notes:
Paypal Donations: Last edited by kiwidude; 04-07-2024 at 02:19 AM. Reason: New version |
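For anyone curious about the readability statistics mentioned in the first post, here is a minimal sketch of the standard published formulas for the Flesch-Kincaid grade level and the Gunning Fog index. The post does not say how the plugin computes them, so the function names and the idea of passing in pre-computed counts are assumptions for illustration only; counting sentences, syllables and "complex" words (three or more syllables) reliably is the hard part and is left out here.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts (standard formula)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog index; complex_words are words of three or more syllables."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))


if __name__ == "__main__":
    # Hypothetical counts for a short passage, not taken from any real book.
    print(round(flesch_kincaid_grade(words=100, sentences=5, syllables=150), 2))
    print(round(gunning_fog(words=100, sentences=5, complex_words=10), 2))
```

Both scores are rough US school-grade estimates, so like the page counts they are best used for comparing books against each other rather than as absolute values.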
05-21-2011, 09:49 PM | #2 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
If you have thoughts about adding support for MOBI have a look at apnx.py in the Kindle device directory.
|
05-21-2011, 10:43 PM | #3 |
eBook Junkie
Posts: 1,526
Karma: 1464018
Join Date: May 2010
Location: USA
Device: Kindle Fire 2020, Kindle PW2
|
|
05-21-2011, 11:15 PM | #4 |
Addict
Posts: 272
Karma: 1050426
Join Date: Feb 2010
Location: California
Device: iPad Mini w/Retina, Kindle 3, Kindle Fire HDX 8.9, & Asus Transformer
|
Add me to the vote.
|
05-22-2011, 10:29 AM | #5 |
Calibre Plugins Developer
Posts: 4,692
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
I did initially wonder about supporting other formats, but the immediate question becomes how would it work, when you have a book that has multiple formats? Say you have a book that has an ePub and a mobi version. Which would it choose to calculate on? Should there be a user configurable preference list (a bit like the input format order?).
And then if using the apnx code, clearly there are the two algorithms - should the choice be hard-coded, user configurable, or use the Kindle driver setting? It opens questions I chose to avoid for the sake of a quick plugin that was primarily a technical experiment as a precursor to the new version of Extract ISBN...
That is not to say I wouldn't be willing to make the changes to make it more useful to others; clearly from the posts above there is interest. I would just appreciate some input as to how you would like to see it working. For myself I only store ePub and mobi versions so always have an ePub to "calculate" on. I guess it will be interesting to see how the numbers differ on user_none's page calculations versus the simple calc done on ePubs for the viewer. |
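One way the "user configurable preference list" idea could work is sketched below. This is only an illustration of the question being asked, not the plugin's code: `pick_format` and its arguments are hypothetical names, and a real plugin would get the available formats and the preference order from calibre itself.

```python
from typing import Iterable, Optional, Sequence


def pick_format(available: Iterable[str],
                preferred_order: Sequence[str]) -> Optional[str]:
    """Return the first format in preferred_order that the book has.

    Comparison is case-insensitive; None means no usable format exists.
    """
    available_upper = {fmt.upper() for fmt in available}
    for fmt in preferred_order:
        if fmt.upper() in available_upper:
            return fmt.upper()
    return None


if __name__ == "__main__":
    # A book with both formats: the order of the preference list decides.
    print(pick_format(["mobi", "epub"], ["EPUB", "MOBI"]))
    print(pick_format(["mobi", "epub"], ["MOBI", "EPUB"]))
    # A book with only a text format: nothing to count on.
    print(pick_format(["txt"], ["MOBI", "EPUB"]))
```

Returning `None` rather than raising leaves it to the caller to decide what message to show when none of the preferred formats exist.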
05-22-2011, 11:01 AM | #6 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
The fast APNX algorithm will give approximately half of the page count as the viewer. It counts every 2300 characters as a page. I believe the viewer uses 1024. The more accurate APNX algorithm will be substantially different because it only looks at visible characters and checks for paragraph tags / length.
|
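To make the numbers in the post above concrete, here is a minimal sketch of a characters-per-page estimate using the two figures quoted (2300 for the fast APNX algorithm, 1024 for the viewer). It is an assumption-laden illustration, not calibre's actual code: the function names are hypothetical, and `visible_char_count` is only a crude stand-in for "looking at visible characters"; it does not attempt the paragraph tag / length checks of the more accurate algorithm.

```python
import math
import re

FAST_APNX_CHARS_PER_PAGE = 2300  # figure quoted in the post above
VIEWER_CHARS_PER_PAGE = 1024     # figure quoted in the post above


def estimate_pages(char_count: int, chars_per_page: int) -> int:
    """Round up to whole pages, with a minimum of one page."""
    if chars_per_page <= 0:
        raise ValueError("chars_per_page must be positive")
    return max(1, math.ceil(char_count / chars_per_page))


def visible_char_count(html: str) -> int:
    """Crude visible-text length: drop tags, collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)
    return len(" ".join(text.split()))


if __name__ == "__main__":
    chars = 230_000  # hypothetical book length in characters
    print(estimate_pages(chars, FAST_APNX_CHARS_PER_PAGE))
    print(estimate_pages(chars, VIEWER_CHARS_PER_PAGE))
    print(visible_char_count("<p>Hello <b>world</b></p>"))
```

With these constants the fast estimate comes out at roughly 1024/2300, or about 45%, of the viewer's figure, which matches the "approximately half" described above.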
05-22-2011, 01:50 PM | #7 |
eBook Junkie
Posts: 1,526
Karma: 1464018
Join Date: May 2010
Location: USA
Device: Kindle Fire 2020, Kindle PW2
|
I cannot comment on the technical aspect; however, as a user I would like it to be user configurable as to which format should be the default when more than one exists. While I have both mobi and epub in my library, mobi is my primary format so that is the one that I would like to use. Unfortunately I am not able to use the apnx plugin because the k2 does not support it.
|
05-22-2011, 02:11 PM | #8 |
Calibre Plugins Developer
Posts: 4,692
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@user_none - thx for that info. Yes I must confess in my random sampling of ePub page counts I found the published value was between 50% and 65% of the Calibre calculated value. So that sounds like using your mobi calculations as the "first choice" would be a nice default. I guess the problem is that means you would get very inconsistent results in your library though, depending on which format you chose. And that would rather negate the intent of the plugin, allowing you to compare at a glance books to see if you wanted a "quick read" versus something to get stuck into.
Perhaps the solution to that is to not use Calibre's page count from the EBookIterator, and to instead replicate your apnx approach for ePubs. What are your thoughts on that - is that feasible in your opinion? @Nyn - thx for your thoughts. Provided both ePub and mobi used "similar" algorithms, then hopefully it should be mostly immaterial which you ran (assuming they are a conversion of the same edition of course). |
05-22-2011, 09:15 PM | #9 |
Calibre Plugins Developer
Posts: 4,692
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
v1.0.1 Released
Changes in this release:
Thanks to all for the feedback above. Note that I have decided to change the ePub count away from the Adobe "1 page = 1024 chars" that the ebook viewer uses and instead apply an algorithm very similar to that created by user_none for generating APNX files for Mobi formats. So you should now get roughly similar numbers regardless of whether you scan Mobi or ePub formats. You can choose which to prioritise in the configuration dialog.
I've added some extra logic in the ePub counting that the APNX mobi code does not have. So you might find for a minority of books that are "weird" internally that the ePub count offers a more consistent result. It is a luxury I have that user_none did not, given his aim of not sacrificing performance while still offering a useful solution to Kindle users. However, for the most part either should give you comparable results, and thanks to user_none for his starting point and information.
In my own quick sampling I found that the algorithms generally tend to slightly underestimate the pages compared to paperback versions, but there were exceptions where the reverse was true. There are also inconsistent results in printed page counts - hardback vs paper vs ebook vs large print editions for instance, fonts, line spacing etc. So treat the numbers from this plugin just as a general indication of relative size and have some fun |
05-22-2011, 10:38 PM | #10 |
eBook Junkie
Posts: 1,526
Karma: 1464018
Join Date: May 2010
Location: USA
Device: Kindle Fire 2020, Kindle PW2
|
Just one tiny thing, without realizing it I selected my clippings file to have a page number calculated. This "book" is only in a text format, I have the mobi format selected as my preferred format. The error message said it could not calculate the page count because no epub version existed, which is correct. However, would it be better to have the error message state the preferred format didn't exist, rather than epub?? Just a question.
|
05-23-2011, 03:57 AM | #11 |
Wizard
Posts: 1,770
Karma: 30063305
Join Date: Dec 2006
Location: Singapore
Device: Boyue
|
The page count is still about 30% more than the hardcover and paperback counts listed on Amazon. Would it be possible to get those numbers? I would really prefer those numbers, or for the calculated numbers to be close to them.
I do know that for some books the count won't be that close, as sometimes the typeface used might be bigger. But for the 10 or so books I compared, the count was consistently around 30% more than that for the hardcover. I do read a lot of fantasy and scifi and usually the typeface is more or less the same for those books, so maybe that's why I got such a consistent result. If that's not possible, would it be possible to make the page count user configurable, so we can specify how many words or letters count as one page? |
05-23-2011, 04:43 AM | #12 |
Wizard
Posts: 1,770
Karma: 30063305
Join Date: Dec 2006
Location: Singapore
Device: Boyue
|
OK, forget about my previous post, as I don't think you can get a very accurate result. The 10 books I had tried before were the largest by page count from the earlier count, and all were large fantasy and scifi books. With the updated plugin the count for all those books was about 30-35% more than that stated on Amazon for hardcover and paperback.
But for other genres like romance the count is lower by about the same amount or more, so this count is only good for comparing ebook libraries. |
05-23-2011, 07:58 AM | #13 |
Calibre Plugins Developer
Posts: 4,692
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@dopedangel - like I posted above there is no way for it to be particularly accurate so don't get hung up on trying to compare with actuals. Think about all the differences between books on your shelf in terms of size of fonts etc. In my own testing I didn't find the variance to be that large but of course it can happen.
|
05-23-2011, 11:13 AM | #14 |
Calibre Plugins Developer
Posts: 4,692
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
v1.0.2 Released
Changes in this release:
Thx Nyn for the reminder on this. |
05-24-2011, 05:29 AM | #15 |
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
|
NIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIICE
My books show up as 50% of pages shown in the viewer. But as an indication this is a huge upgrade compared to file size. And again a great plugin! |