06-12-2010, 03:55 PM | #1 |
Junior Member
Posts: 8
Karma: 10
Join Date: Jun 2010
Device: none
|
Calibre 7.2 problem
When converting a PDF to EPUB using Calibre 7.2, I get "Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html" on every page.
The original PDF does not show this text when viewed in Adobe Acrobat. Please help. I have a Sony PRS-600. |
06-12-2010, 03:58 PM | #2 |
creator of calibre
Posts: 44,413
Karma: 23977332
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The original PDF does have it; it is just hidden. Use the Remove Header option in Preferences. Search this forum for "ABC Amber" to see examples of the Remove Header setting.
|
06-12-2010, 04:31 PM | #3 |
Junior Member
Posts: 8
Karma: 10
Join Date: Jun 2010
Device: none
|
I have tried to use header removal for "Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html" with no luck. I would greatly appreciate it if someone could tell me exactly what to type in the header removal box. Thank you.
|
06-12-2010, 11:40 PM | #4 |
my parent's oops...
Posts: 485
Karma: 1477572
Join Date: Feb 2009
Device: Vx->Handera->Clie-> Axim->505->650->KPW/Aura ->L2->iOS/CBW
|
I use Acrobat Pro and can usually crop the header and footer easily using the crop dialog. Then I go to the Document tab and choose Examine Document. After it searches the doc, it lists the items it has found. I UNcheck metadata and bookmarks, as I don't usually want those removed. I do, however, want to remove hidden data, which is where the cropped headers and footers I can no longer see end up, so I leave hidden data CHECKED. Save the PDF and import it into calibre. After that I can convert it with calibre and the headers should be gone.
|
09-17-2010, 10:08 AM | #5 | |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Quote:
If that requires advanced programming then I'll stick with this workaround: 1. Convert to .rtf. 2. Open the .rtf file, within the calibre library, in Word. 3. Use Find + Replace All to strip all instances of the string. 4. Save. 5. Convert from RTF to EPUB/MOBI as needed. It does not take long, but it has to be done for every book that contains the spam. I can't believe that anyone would actually want this spam left in their books, so please hard-code its detection and removal into a future calibre release. |
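For anyone who wants to avoid repeating the find-and-replace step by hand for every book, here is a minimal Python sketch of that one step. It assumes you already have each book's text as a plain string; it does not parse RTF and is not part of calibre. The names `SPAM`, `strip_spam` and the sample `books` dictionary are made up for illustration.

```python
# The spam line quoted in this thread.
SPAM = "Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html"


def strip_spam(books: dict, spam: str = SPAM) -> dict:
    """Return a copy of `books` (title -> text) with every literal `spam` removed."""
    return {title: text.replace(spam, "") for title, text in books.items()}


# Hypothetical sample data standing in for exported book text.
books = {
    "Book A": "Chapter 1\n" + SPAM + "\nIt was a dark night.\n",
    "Book B": SPAM + "\nThe ship landed.\n" + SPAM + "\n",
}

cleaned = strip_spam(books)
for title, text in cleaned.items():
    print(title, repr(text))
```

This is plain literal replacement, the same thing Word's Replace All does, so it only helps when the string is identical everywhere.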
|
09-17-2010, 10:30 AM | #6 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
The problem is that the string for header/footer removal is book-specific in the vast majority of cases and requires human intervention. If it is not book-specific, you can simply set it as the default in the calibre preferences.
|
09-17-2010, 12:17 PM | #7 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Well, no. That string makes no mention of a specific book; it's generic spam for a conversion program. I have a big collection of old sci-fi in .mobi, and that identical spam string is in lots of them, I guess because they were all converted from LIT to whatever with the same software.
Also, it appears in other threads, so other people have encountered it too. So what's the calibre code that will kill it, and where does that code have to go, please? |
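If the spam line really is identical in every book, the simplest regular expression is the line itself with its special characters escaped. The sketch below uses Python's standard `re.escape` to build such a pattern and checks it against sample text. That the resulting string can be pasted into calibre's header box unchanged is an assumption (calibre is written in Python, but this was not tested against it); `PATTERN`, `count_spam` and the sample `book` are illustrative names.

```python
import re

# The spam line quoted in this thread.
SPAM = "Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html"

# re.escape backslash-escapes regex metacharacters such as "." so that
# they match literally instead of matching any character.
PATTERN = re.escape(SPAM)


def count_spam(text: str) -> int:
    """Count literal occurrences of the spam line in `text`."""
    return len(re.findall(PATTERN, text))


# Hypothetical sample text with the spam line on two "pages".
book = "Page one.\n" + SPAM + "\nPage two.\n" + SPAM + "\n"

print(PATTERN)            # candidate string for the header regular expression
print(count_spam(book))
```

Escaping matters mostly for the dots in the URL: unescaped, the pattern would still match the spam, but it would also match lines that merely resemble it.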
09-17-2010, 01:07 PM | #8 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
You set the regex under Preferences->Conversion->Common Options->Structure Detection, in the Header regular expression field, and tick the Remove Header box.
Last edited by itimpi; 09-17-2010 at 01:13 PM. |
09-17-2010, 02:19 PM | #9 | |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Quote:
What's "the regex"? I'm guessing that's shorthand for REGular EXpression, but what's the exact string that I need to type into that box? |
|
09-17-2010, 03:06 PM | #10 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
Yes - regex means regular expression.
However, without examining an example of the book in question, it is not that easy to work out what the exact regex will be. In the simple case it will just be the string you want to remove, but normally there will be variations in the string (such as page numbers), which you have to use extended regex syntax to specify. Also, since the removal actually happens on the HTML that is an intermediate stage in the conversion, you may also have to allow for additional things, such as HTML tags, being present and needing removal too. Calibre allows you to save these intermediate files for examination for just this sort of purpose. It is this sort of variation that can make it hard to come up with an expression that matches all the cases you want. |
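To make the point about variations concrete, here is a hedged Python sketch of a pattern that tolerates an optional page number and an optional link tag around the URL. The HTML layout in `html` is invented for illustration; the real intermediate files calibre produces may look different, so treat `HEADER_RE` as a starting point to adapt after inspecting your own files, not as a known-good setting.

```python
import re

HEADER_RE = re.compile(
    r"<p\b[^>]*>\s*"                                   # opening paragraph tag, any attributes
    r"(?:\d+\s*)?"                                     # optional leading page number
    r"Generated\s+by\s+ABC\s+Amber\s+LIT\s+Converter,\s*"
    r"(?:<a\b[^>]*>)?"                                 # the URL may be wrapped in a link
    r"http://www\.processtext\.com/abclit\.html"
    r"(?:</a>)?"
    r"\s*(?:\d+\s*)?"                                  # optional trailing page number
    r"</p>",                                           # closing paragraph tag
    re.IGNORECASE,
)

# Hypothetical intermediate HTML: one linked header with a page number,
# one plain header, and two paragraphs of real content.
html = (
    '<p>Chapter One</p>\n'
    '<p class="calibre1">Generated by ABC Amber LIT Converter, '
    '<a href="http://www.processtext.com/abclit.html">'
    'http://www.processtext.com/abclit.html</a> 12</p>\n'
    '<p>It was a dark night.</p>\n'
    '<p>Generated by ABC Amber LIT Converter, '
    'http://www.processtext.com/abclit.html</p>\n'
)

cleaned, removed = HEADER_RE.subn("", html)
print(removed)
print(cleaned)
```

Matching the whole paragraph, from its opening tag to its closing tag, keeps the remaining HTML balanced. A looser pattern that strips arbitrary tags around the text could leave unmatched tags behind.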
|