11-13-2022, 02:34 PM | #1 |
Member
Posts: 13
Karma: 1138306
Join Date: Mar 2018
Device: none
|
Print Page Approximator for EPUB and EPUB3 v1.1.8
After years of reading ebooks with reflowing page numbers and reading progress measured in percent, I discovered KOReader's support for print/reference pages and quickly got hooked. Now I'm one of those people who expect ebooks to retain the page numbering of the print version regardless of font or screen size, and it distresses me how few ebooks seem to use the pageList feature.
It distressed me even more that there was no quick way to add a page list yourself to ebooks that don't have it.
- I don't use ADE and don't think that a single page detection algorithm fits all books.
- Calibre can approximate page numbers when converting to kfx but does not offer any option to get those numbers back into an epub.
So I started developing my own tool for the job, and I think it's at a point where it's fully usable and produces good results. Print Page Approximator is a simple command line utility, and using it to paginate a book is as simple as this:
Code:
.\page_approximator.exe .\example_book.epub 150
As of version 1.1.5 the tool also supports calculating a custom page count based on book contents (characters/words/lines). Otherwise, it takes any page count you want and calculates page breaks based on that. And as of version 1.1.8 you can also "upgrade" books that have non-standard page markers, converting said markers to working print reference pages with page-list entries.
For those who want finer control over how page breaks are generated there are quite a few advanced options available, among them are:
The output of this tool is spec compliant for both the pageList in EPUB2 and the page-list nav in EPUB3, so if a device supports pageList normally, there should be no problem.
Important: For devices/apps that only support the Adobe version of pageList, "page-map" (apparently this includes the standard reader on Kobo, thanks to @Sirtel for testing this), an additional page map file can be generated by appending the flag --page-map.
Personally I don't really have any way to test the results outside of KOReader, so I'd really appreciate feedback about how well page support works on different devices.
I am aware that some might question the point of generating an arbitrary and inaccurate approximation of an already arbitrary and inconsistent metric that is technically obsolete anyway. But I just think it shouldn't be too much to ask that the book that has 344 pages on my shelf also has 344 pages on my tablet, and with this it's possible within a few seconds.
Attached are a standalone executable for 64-bit Windows as well as the Python source code for other platforms.
*If you're running the script, please note that Python 3.10 and the "ebooklib" library are required.
...I am also thinking of turning this tool into a calibre plugin, but that's a bit of a long term goal.
Links: Source on GitHub
Last edited by Thertzler; 04-02-2023 at 04:15 PM. Reason: Upgrade to 1.1.8 |
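For readers wondering what "takes any page count you want and calculates page breaks based on that" could look like in practice, here is a minimal sketch. It is not taken from the tool's source: the function name `approximate_page_starts` and the rule of sliding each break forward to the next whitespace are assumptions made purely for illustration.

```python
def approximate_page_starts(text: str, page_count: int) -> list[int]:
    """Return the character offset at which each approximated page starts.

    Hypothetical helper, not the tool's real algorithm: breaks are spread
    evenly over the text and then moved forward to the next whitespace so
    that no page starts in the middle of a word.
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")

    starts = []
    for page in range(page_count):
        # Evenly spaced offset for this page.
        offset = (page * len(text)) // page_count
        # Page 1 always starts at offset 0; later breaks slide forward
        # until they land on whitespace (or the end of the text).
        while 0 < offset < len(text) and not text[offset].isspace():
            offset += 1
        starts.append(offset)
    return starts


if __name__ == "__main__":
    sample = "aaaa bbbb cccc dddd"
    print(approximate_page_starts(sample, 4))
```

A real implementation would also have to map each offset back to a position inside the book's XHTML files, which this sketch leaves out.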
11-13-2022, 04:56 PM | #2 |
Grand Sorcerer
Posts: 12,183
Karma: 233447200
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
Wow, that's certainly something I'll try on my Kobos! Thank you!
|
11-13-2022, 05:24 PM | #3 |
Grand Sorcerer
Posts: 12,183
Karma: 233447200
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
...aaand I'm stumped already. How exactly does one use the Windows executable? I know nothing about coding and very little about command line operations, and can't figure it out. Where should I put the epub file? When I try to specify the file path in the command line, I get "Invalid argument" or "No such file or directory".
|
11-13-2022, 06:45 PM | #4 | |
Member
Posts: 13
Karma: 1138306
Join Date: Mar 2018
Device: none
|
Quote:
About the tool not liking the file path... Are there spaces in the epub's file name? If so, the path needs to be in quotation marks. |
|
11-13-2022, 06:54 PM | #5 | |
Grand Sorcerer
Posts: 12,183
Karma: 233447200
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
Quote:
In short, I don't know what exactly I should put in the command line. No matter what I do, I get an error. Presumably I have the command wrong in some way. I don't really know anything about command line. |
|
11-14-2022, 07:46 AM | #6 | |
Member
Posts: 13
Karma: 1138306
Join Date: Mar 2018
Device: none
|
Quote:
Would you mind simply posting the command you are currently trying to run? |
|
11-14-2022, 02:24 PM | #7 | |
Grand Sorcerer
Posts: 12,183
Karma: 233447200
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
Quote:
Code:
C:\Users\videv\Downloads\page_approximator_win_x64\page_approximator.exe .\EssexDogs.epub 466 |
|
11-14-2022, 03:45 PM | #8 |
Member
Posts: 13
Karma: 1138306
Join Date: Mar 2018
Device: none
|
So the most likely issue here is that your command line is not actually executing in the same directory as the executable.
If you are simply opening the command prompt from the start menu, it tends to start in your user directory or maybe in some system32 Windows folder. It'll find the page_approximator.exe file because you provided the absolute path to it, but not the epub, because its path is relative. So either set the cmd location to that folder before running the other command with this:
Code:
cd C:\Users\videv\Downloads\page_approximator_win_x64
or provide the absolute path to the epub as well:
Code:
C:\Users\videv\Downloads\page_approximator_win_x64\page_approximator.exe C:\Users\videv\Downloads\page_approximator_win_x64\EssexDogs.epub 466 |
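The relative-versus-absolute distinction can be illustrated with a small Python sketch. It only models how a relative argument is combined with the current directory; the helper name `resolve` and the starting directory `C:\Users\videv` are assumptions for the example (the command prompt may start somewhere else).

```python
from pathlib import PureWindowsPath


def resolve(current_dir: str, argument: str) -> PureWindowsPath:
    """Return the path a program started in current_dir would try to open.

    Hypothetical helper for illustration. A relative argument is appended
    to the current directory; an absolute argument replaces it entirely.
    """
    return PureWindowsPath(current_dir) / argument


if __name__ == "__main__":
    exe_dir = r"C:\Users\videv\Downloads\page_approximator_win_x64"

    # Started in the (assumed) user directory: the epub is looked up there.
    print(resolve(r"C:\Users\videv", r".\EssexDogs.epub"))

    # After cd into the tool's folder, the same relative path works.
    print(resolve(exe_dir, r".\EssexDogs.epub"))

    # An absolute path works no matter where the prompt was started.
    print(resolve(r"C:\Users\videv", exe_dir + r"\EssexDogs.epub"))
```

The first call points at a file in the user directory that does not exist, which is where a "No such file or directory" error would come from; the other two point at the folder that actually contains the epub.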
11-14-2022, 03:52 PM | #9 |
Grand Sorcerer
Posts: 12,183
Karma: 233447200
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
Many thanks! It worked now!
I'll try the book on my Sage. I suspect it may not work, as Kobo has their own page estimation methods, but can't know for sure until trying. |
11-14-2022, 04:03 PM | #10 |
Wizard
Posts: 1,520
Karma: 16300090
Join Date: Sep 2022
Device: Kobo Libra 2
|
Maybe it's out of scope, but would you consider adding an alternate mode for word-based page approximation? My understanding is that ADE uses the number of bytes to approximate pages, so books with lots of formatting end up with more pages than books with sparse formatting. It would be a nice ADE alternative to be able to number the pages based on a user-specified number of characters/words/lines per page, instead, so you could have a consistent page metric across all your e-books.
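To illustrate the difference I mean, here is a rough Python sketch comparing a byte-based estimate with a character-based one. It does not reproduce ADE's actual method; the function names, the crude regex for stripping tags, and the figure of 500 units per page are all assumptions chosen for the example.

```python
import math
import re


def pages_by_bytes(markup: str, bytes_per_page: int) -> int:
    """Estimate pages from the raw size of the markup, tags included."""
    return math.ceil(len(markup.encode("utf-8")) / bytes_per_page)


def pages_by_characters(markup: str, chars_per_page: int) -> int:
    """Estimate pages from the text left after removing tags.

    The regex is a deliberately crude tag stripper, good enough for this
    illustration but not for real-world XHTML.
    """
    visible = re.sub(r"<[^>]+>", "", markup)
    return math.ceil(len(visible) / chars_per_page)


if __name__ == "__main__":
    # The same 200 words, once with sparse and once with heavy formatting.
    plain = "<p>" + "word " * 200 + "</p>"
    styled = "<p>" + '<span class="x">word</span> ' * 200 + "</p>"

    for name, book in (("plain", plain), ("styled", styled)):
        print(name, pages_by_bytes(book, 500), pages_by_characters(book, 500))
```

With the byte-based estimate the heavily formatted version gets several times as many pages as the plain one, while the character-based estimate gives both the same count.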
|
11-14-2022, 04:04 PM | #11 | |
Grand Sorcerer
Posts: 12,183
Karma: 233447200
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
Quote:
|
|
11-14-2022, 04:24 PM | #12 |
Wizard
Posts: 1,520
Karma: 16300090
Join Date: Sep 2022
Device: Kobo Libra 2
|
That's good to know, but I much prefer to use a simple CLI utility when possible, especially in cases where batch processing is desirable.
|
11-14-2022, 05:05 PM | #13 |
Grand Sorcerer
Posts: 12,183
Karma: 233447200
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
Anyway, I'm sorry to report that this method doesn't work in Nickel (the stock reader on eink Kobo devices), either with epubs or kepubs. Epubs on a Kobo use the Adobe page numbering system and kepubs use 1 screen=1 page. There is no way to force the stock reader to use any other system. So this approximator would be of use only in KOReader.
I'm disappointed, even though I kind of expected this. But it would have been nice to use my own page numbering system. |
11-14-2022, 05:53 PM | #14 |
Wizard
Posts: 1,520
Karma: 16300090
Join Date: Sep 2022
Device: Kobo Libra 2
|
Ouch. I wonder if we could convince Kobo to support this in a future firmware release?
|
11-14-2022, 08:25 PM | #15 | ||
Member
Posts: 13
Karma: 1138306
Join Date: Mar 2018
Device: none
|
Quote:
In the newest release, which I have attached to this post, you can append the flag --page-map to the pagination command and it will add that file to the epub as well. I've downloaded the desktop version of ADE, and it displays the new page count correctly.
I'd love to update the opening post with this as well, but I don't see the edit button mentioned in the FAQ... some restriction for new users, I assume?
Quote:
But maybe a feature like that, some sort of page count suggestion, could be added. In order to find valid locations for page breaks, the page approximator does indeed already create a "raw" representation of the book text without any XML/HTML markup and tags, and it further limits the text to the content of tags that can reasonably be assumed to be visible to the reader (so the content of meta/head/style/script tags etc. is omitted). This could be a pretty good basis for what you want, and the character/lines limit would be easily doable.
Words might get a bit more complicated, if only because they can be a bit of a pain to formally define, and you might potentially end up with a difference of more than a few hundred words depending on whether you count contractions as one or two words and other stuff like that. |
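As a rough illustration of the "raw" text idea and of the word-counting ambiguity, here is a minimal sketch using only Python's standard library. It is not the tool's actual code: the class `VisibleText`, the `HIDDEN_TAGS` set, and the two word patterns are assumptions made for the example.

```python
import re
from html.parser import HTMLParser

# Assumption: tags whose content a reader never sees. meta is left out
# because it is an empty element and carries no text content anyway.
HIDDEN_TAGS = {"head", "style", "script"}


class VisibleText(HTMLParser):
    """Collect text that is not nested inside a hidden tag."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth:
            self.chunks.append(data)


def visible_text(markup: str) -> str:
    """Return the markup's text with tags and hidden content removed."""
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def count_words(text: str, split_contractions: bool = False) -> int:
    """Count words, treating "don't" as one word or as two."""
    if split_contractions:
        pattern = r"[A-Za-z]+"
    else:
        pattern = r"[A-Za-z]+(?:'[A-Za-z]+)*"
    return len(re.findall(pattern, text))


if __name__ == "__main__":
    page = ("<html><head><title>T</title><style>p{}</style></head>"
            "<body><p>Don't panic, it's fine.</p></body></html>")
    text = visible_text(page)
    print(text)
    print(count_words(text), count_words(text, split_contractions=True))
```

Even in this one short sentence the two definitions disagree (four words versus six), which is how the totals can drift apart over a whole book.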
||
Tags |
epub, page breaks, page numbering, python, tool |
|