03-17-2024, 12:33 PM | #1 |
Enthusiast
Posts: 46
Karma: 10
Join Date: Dec 2009
Device: none
|
Single page vs. multiple pages (epub creation)
I'm trying to create an epub. The original source of the text is a PDF, but I downloaded an "epub" version generated by archive.org.
Unfortunately, the generated epub is far from perfect! The content is not split where it should be, etc. So I'm trying to clean everything up and create the final epub myself. I have already read about the epub format and I have successfully created one. But I still have one question. In the process of collecting all the original text (I am a developer, so I wrote some code for this), I ended up with a single large piece of text. I could split the text into multiple pages, but I'm wondering whether this is really useful. Can't I just have one big "content.html" file, with "id" attributes on some "<p>" elements so that the "toc.ncx" can point to them? Are there advantages to splitting the content into several separate files instead of one big one? |
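For illustration, the single-file approach described above could be sketched like this in Python. This is a minimal sketch and not the poster's actual code: the helper name `nav_points`, the ids `ch1`/`ch2` and the `nav-N` navPoint ids are all hypothetical. It only builds the `<navPoint>` entries that would go inside the `<navMap>` of "toc.ncx", each pointing at a fragment id in one "content.html".

```python
from xml.sax.saxutils import escape


def nav_points(entries, content_file="content.html"):
    """Build NCX <navPoint> elements that point at fragment ids in one file.

    entries: list of (element_id, title) pairs, in reading order.
    The function name and the "nav-N" id scheme are illustrative only.
    """
    parts = []
    for order, (element_id, title) in enumerate(entries, start=1):
        parts.append(
            f'<navPoint id="nav-{order}" playOrder="{order}">\n'
            f"  <navLabel><text>{escape(title)}</text></navLabel>\n"
            f'  <content src="{content_file}#{element_id}"/>\n'
            f"</navPoint>"
        )
    return "\n".join(parts)


# Hypothetical ids that would have to exist in content.html, e.g. <p id="ch1">
print(nav_points([("ch1", "Chapter 1"), ("ch2", "Chapter 2")]))
```

Each `src` value takes the form `content.html#ch1`, so the table of contents can jump to a position inside the single file without any splitting.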
03-17-2024, 01:18 PM | #2 |
The Grand Mouse 高貴的老鼠
Posts: 72,544
Karma: 309960766
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
The advantage of having multiple files is that more ereaders will handle the ePub successfully. Ereaders tend to have much less memory than most computers, and some struggle with a single very large file.
|
03-17-2024, 01:25 PM | #3 |
Enthusiast
Posts: 46
Karma: 10
Join Date: Dec 2009
Device: none
|
I see! Thanks for the information.
|
03-17-2024, 01:44 PM | #4 |
The Grand Mouse 高貴的老鼠
Posts: 72,544
Karma: 309960766
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
There are disadvantages too. Most ereaders enforce a page break between files.
|
03-17-2024, 01:47 PM | #5 |
Enthusiast
Posts: 46
Karma: 10
Join Date: Dec 2009
Device: none
|
Yes, I guessed that. You have to split the files at the right positions. Thanks again.
|
03-18-2024, 08:35 AM | #6 |
the rook, bossing Never.
Posts: 12,379
Karma: 92073397
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
My experience is that the Internet Archive "ebooks" are worthless. They are generated automatically from unproofed OCR text that is only marginally good for searching.
So I deleted them all and now only download PDFs (after checking they are really PD) and read them on a tablet. You are better off doing your own OCR of the PDF and proofing it. Do put page breaks at chapters, sections or other natural breaks in your word processor. Later those will start new files in the epub. A new file is the only reliable page break, and it works for epub converted to mobi, azw3/KF8, dual mobi and KFX. |
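For someone who already has the whole book as one block of HTML produced by their own code, the same "new file at each natural break" idea could be sketched in Python as below. This is only a sketch under stated assumptions: it assumes every chapter starts with an `<h1>` element, and the names `split_at_headings`, `build_files` and the `partNNN.xhtml` file names are hypothetical. It needs Python 3.7 or later, because it relies on `re.split` splitting on zero-width matches.

```python
import re

# Minimal XHTML wrapper for each output file (illustrative only).
XHTML_TEMPLATE = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body>
{body}
</body>
</html>
"""


def split_at_headings(body_html, tag="h1"):
    """Split an HTML fragment so that each chunk starts at an opening <tag>.

    Assumption: chapters begin with <h1>; text before the first heading
    (for example a preface) is kept as its own chunk.
    """
    # Zero-width lookahead: split just before "<h1>" or "<h1 ...>".
    pattern = re.compile(rf"(?=<{tag}[\s>])", re.IGNORECASE)
    return [chunk for chunk in pattern.split(body_html) if chunk.strip()]


def build_files(body_html, tag="h1"):
    """Return a dict mapping hypothetical file names to XHTML documents."""
    files = {}
    for n, chunk in enumerate(split_at_headings(body_html, tag), start=1):
        files[f"part{n:03d}.xhtml"] = XHTML_TEMPLATE.format(
            title=f"Part {n}", body=chunk
        )
    return files


sample = "<p>Preface</p><h1>One</h1><p>a</p><h1 id='x'>Two</h1><p>b</p>"
for name in build_files(sample):
    print(name)
```

Each resulting file would then need its own entry in the epub manifest and spine, so that the reader starts a new page at every chapter.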
03-18-2024, 11:38 AM | #7 |
Enthusiast
Posts: 46
Karma: 10
Join Date: Dec 2009
Device: none
|
Good to know. The text of the one I downloaded isn't too bad, it's just not formatted properly. I do find some OCR errors, but not too many.
Anyway, my opinion is that if you really want to do a good job, you need to read the final epub from A to Z once it's created, fix mistakes, and then create a final version. Some books really deserve that work. I plan to create epubs for other books too, so I might try what you suggest! I'll test some OCR applications to see how they work. Thanks. |
03-18-2024, 02:46 PM | #8 | |
Fool
Posts: 424
Karma: 3585252
Join Date: Feb 2003
Device: Kindle: Voyage,PW1,KOA, Kobo: Clara Colour, Nook GLP, Pocketbook verse
|
Quote:
|
|
03-18-2024, 05:08 PM | #9 |
the rook, bossing Never.
Posts: 12,379
Karma: 92073397
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
About 100% of the ones I downloaded were poor, and more than half were unusable.
Perhaps it depends on the source, the content and when the scanned book was printed. Also check they aren't "pirated". Many other people have remarked that you are better off downloading the PDF. |
03-18-2024, 05:18 PM | #10 | |
Wizard
Posts: 1,365
Karma: 6794938
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
https://www.mobileread.com/forums/sh...93#post4341993 |
|
03-18-2024, 07:36 PM | #11 | |
Enthusiast
Posts: 46
Karma: 10
Join Date: Dec 2009
Device: none
|
Quote:
|
|