02-07-2011, 08:07 PM | #1 |
Addict
Posts: 340
Karma: 43106
Join Date: Apr 2009
Location: Germany
Device: BeBook One, Pocketbook Touch, Pocketbook Touch HD
|
[Old Thread] merging split html files with Calibre?
I think someone wrote how this works, and I even managed to do it myself, but I can't remember how, so ...
I downloaded an ebook in epub format. However, I noticed that there are many errors in the ebook. Also, there is no TOC. So I want to correct all this stuff. I know an epub is just a zip file with an HTML file inside. But here is the problem: like in most epubs, the HTML is split into multiple small HTML files. I once read something like this: Calibre can join these small HTML files into one big HTML file. When you convert, there is an option for splitting files if they are larger than 260 KB. Change this to something like 99999 KB. Then you would end up with an epub that has only one big HTML file, which could be easily edited. I tried to convert from epub to epub using this setting, but I am doing something wrong, because there are still multiple HTML files after I exported the new ebook. Can anyone help me (if anyone can understand what I am trying to do)? Thanks in advance. |
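Since an epub is a zip archive, a quick way to see how the book was split is to list the HTML parts inside it. The following is only a minimal sketch: the function name list_html_parts and the set of file extensions are assumptions made for illustration, not anything Calibre provides.

```python
import zipfile
from pathlib import PurePosixPath

# Extensions treated as "HTML parts" here; this set is an assumption.
HTML_SUFFIXES = {".html", ".htm", ".xhtml"}


def list_html_parts(epub_path):
    """Return (name, size_in_bytes) for every HTML-like file in the epub, sorted by name."""
    with zipfile.ZipFile(epub_path) as zf:
        return sorted(
            (info.filename, info.file_size)
            for info in zf.infolist()
            if PurePosixPath(info.filename).suffix.lower() in HTML_SUFFIXES
        )
```

Calling list_html_parts("book.epub") on a split book should return one entry per small file, which makes it easy to check whether a conversion really produced a single file.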
02-07-2011, 08:31 PM | #2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Calibre can't merge HTML that's already split. Your best bet is to convert to a format that is natively a single file - try text (using Markdown or Textile output), or RTF. If you're using non-ASCII characters, text is probably a better fit.
Edit your doc from there, then convert back. Last edited by ldolse; 02-07-2011 at 11:16 PM. |
02-07-2011, 10:51 PM | #3 | |
US Navy, Retired
Posts: 9,865
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Quote:
Alternatively, you can do as ldolse said: convert to text (using Markdown or Textile output) or RTF, then save it as filtered HTML if HTML is your favorite editing medium. Good luck. |
|
02-08-2011, 12:12 AM | #4 |
Junior Member
Posts: 8
Karma: 10
Join Date: Feb 2011
Device: iPhone
|
The latest version of Sigil absolutely will merge .html or .xhtml files inside an EPUB container into 1 file. I have used it to edit poorly edited EPUBs where chapters are split across files. So I merge the two files then chapter break the big file at the end of the chapter I was working on. I don't know why you would want to, but you could merge every single .xhtml file into 1 giant file or just combine then resplit the chapters like I am doing.
Sigil comes in Linux, Mac and Windows flavors and it is GPL Open Source. http://code.google.com/p/sigil/ |
02-08-2011, 12:55 AM | #5 |
Connoisseur
Posts: 80
Karma: 8320
Join Date: Apr 2009
Device: Ipod Touch
|
I'd recommend Sigil too, but sometimes when I need to have a single html file, and I'm too lazy to use Sigil, I cheat and use the debug feature in calibre. Convert the epub to mobi, then reconvert the mobi to epub, and use the processed html file.
|
02-08-2011, 04:25 PM | #6 |
Zealot
Posts: 110
Karma: 5176
Join Date: Dec 2010
Device: Mac OSX, iPad, iPod, & Nook
|
Mac OSX has a command line utility called 'textutil' (as far as I know it ships only with OSX; on Linux and Unix, plain 'cat' does the concatenating).
It can concatenate (combine in sequence) several files into one single file. I don't know if Windows has a similar program. I used textutil several times to combine several html files into one file. As long as they are numbered sequentially you can use a wildcard in the concatenate command and it will grab all the files called 'foo1' thru 'foo9' for instance and stitch them together. You still need to clean up the output but it is fast and just takes one single command. http://www.unix.com/man-page/All/1/TEXTUTIL/ Archon |
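For anyone without textutil, the same stitching can be sketched in Python, which runs on Windows as well. This is a rough illustration under stated assumptions: the names merge_html and natural_key and the 'foo*.html' pattern are made up for the example, the files are assumed to be UTF-8, and the regex-based body extraction is a shortcut rather than a real HTML parser, so the output still needs the clean-up mentioned above.

```python
import re
from pathlib import Path

# Grabs whatever sits between <body ...> and </body>; a shortcut, not a full parser.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def natural_key(path):
    """Sort key that compares digit runs as numbers, so foo2 comes before foo10."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", path.name)]


def merge_html(folder, pattern="foo*.html", title="Merged"):
    """Join the body content of all matching files, in numeric order, into one HTML string."""
    parts = sorted(Path(folder).glob(pattern), key=natural_key)
    bodies = []
    for part in parts:
        text = part.read_text(encoding="utf-8")
        match = BODY_RE.search(text)
        # Fall back to the whole file when no <body> element is found.
        bodies.append(match.group(1).strip() if match else text.strip())
    return ("<html><head><title>%s</title></head><body>\n" % title
            + "\n".join(bodies)
            + "\n</body></html>\n")
```

Writing the result out is then one more line, for example Path("merged.html").write_text(merge_html("."), encoding="utf-8"). Unlike a plain concatenation, this keeps only one html/body wrapper, but it drops each part's head section, including any stylesheet links.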
02-09-2011, 11:47 AM | #7 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Calibre has a limited ability to merge html files:
http://calibre-ebook.com/user_manual...specific-order It's the fourth item in the FAQ, which has always struck me as an oddly prominent spot for what seems to be a relatively uncommon thing to do with Calibre. |
02-09-2011, 12:10 PM | #8 | |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Quote:
I saw someone say in another thread that converting to mobipocket and back will merge the files to a single flow, but keep the majority of the html... Haven't tried it to confirm. |
|
11-30-2011, 04:30 PM | #9 |
Junior Member
Posts: 1
Karma: 10
Join Date: Nov 2011
Device: none
|
As Archon said, a command line will concatenate all those files pretty fast. In Windows, open a cmd window, change to the directory that holds the split files, and enter something like this:
copy <myebookname part0??.htm> NewFileName
The question marks in the file name get replaced by the sequential numbers in your split files, assuming they are sequentially numbered. If not, you'll have to renumber them manually. NewFileName now contains all of the "parts" that got matched to the wildcard '??'. You should see a list of the files scroll up the screen. Good Luck |
12-01-2011, 11:15 AM | #10 |
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
Correct me if I'm wrong, but can't you just zip the contents of the folder with split html files, and then change the extension to epub?
EDIT: Ignore the above. I don't know what I'm talking about. Don't judge me too harshly; I can't read or write... Last edited by ElMiko; 12-01-2011 at 03:11 PM. |
12-01-2011, 11:46 AM | #11 | |
Well trained by Cats
Posts: 30,421
Karma: 58055868
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
There are EPUB specific rules on what goes where and how it is stored. Tweak EPUB and Sigil take care of complying with the rules when they reassemble the package |
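One of those rules, as an illustration: the archive has to start with a file named mimetype that contains application/epub+zip and is stored without compression. The sketch below covers only that packaging rule; zip_as_epub is a hypothetical helper name, and the folder is assumed to already hold a valid META-INF/container.xml and OPF file, which this code does not check.

```python
import zipfile
from pathlib import Path


def zip_as_epub(source_dir, epub_path):
    """Zip an unpacked epub folder, writing the mimetype entry first and uncompressed."""
    source = Path(source_dir)
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry must be the first one in the archive and must be stored, not deflated.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for path in sorted(source.rglob("*")):
            if not path.is_file():
                continue
            rel = path.relative_to(source).as_posix()
            if rel == "mimetype":
                continue  # already written above
            zf.write(path, rel, compress_type=zipfile.ZIP_DEFLATED)
```

The output path should sit outside source_dir, otherwise the half-written archive would be picked up by the directory walk.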
|
12-01-2011, 12:00 PM | #12 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
You both got tricked into responding to an old thread - this thread was created before htmlz was implemented - which based on the OP's description (source was an ePub likely created by an old version of Calibre) is now the best solution.
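A rough sketch of the htmlz route, for readers who prefer scripting it: calibre's ebook-convert command line tool picks the output format from the file extension, and the resulting htmlz file is a zip archive. The helper names below are made up for the example, and the assumption that the merged page is stored as index.html inside the archive should be verified against your calibre version.

```python
import subprocess
import zipfile


def htmlz_command(epub_path, htmlz_path):
    """Build the ebook-convert call; the .htmlz extension selects the output format."""
    return ["ebook-convert", epub_path, htmlz_path]


def convert_to_htmlz(epub_path, htmlz_path):
    """Run the conversion; requires calibre's command line tools on the PATH."""
    subprocess.run(htmlz_command(epub_path, htmlz_path), check=True)


def read_index_html(htmlz_path):
    """Return the single HTML page from the archive (assumed to be named index.html)."""
    with zipfile.ZipFile(htmlz_path) as zf:
        return zf.read("index.html").decode("utf-8")
```

After editing the HTML, the htmlz can be converted back to epub with the same tool.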
|
12-01-2011, 06:57 PM | #13 | |
US Navy, Retired
Posts: 9,865
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Welcome to Mobileread.
Quote:
Moderator Notice Glad to have you on board and I hope you continue to assist the community with your willingness to help answer questions where you have the requisite knowledge needed.
Do not open old threads to answer questions. Calibre is a fast-moving development project and there have been 40 revisions since the original question in this thread was asked. Your added information, while applicable, has been superseded by the htmlz feature added to calibre 32 revisions ago. |
|
12-03-2011, 01:19 PM | #14 | |
Addict
Posts: 340
Karma: 43106
Join Date: Apr 2009
Location: Germany
Device: BeBook One, Pocketbook Touch, Pocketbook Touch HD
|
Quote:
I am surprised that none of the latest few postings mentions htmlz. Back when it was implemented, there was a big "thanks" posting for this feature. So thanks again for all the replies, but to anyone who reads this in the future: there is no more need to answer. Maybe a mod should close the thread. |
|