02-12-2012, 03:41 PM | #1 |
Obsessive Compulsive?
Posts: 4
Karma: 54
Join Date: Feb 2012
Device: Kindle Keyboard
|
eBookCleaner
Ebooks, even retail ones, are often badly formatted, to the point that it detracts from the reading experience. And while editing programs exist, they fall short of the task. Do you have the time or patience to trawl through thousands of paragraphs and mangled HTML and CSS (div-div-div-p-span-span-br-span...)? A close friend of mine, 'burbleburbleburble', wrote a prototype of this program a while back. While he has more or less retired from it, I have taken up the project, and it has finally reached an acceptable level of sophistication.
As the situation currently stands: eBookCleaner works for me, and is highly customized for my needs. Still, I am interested in and willing to work with the community to improve it as per everyone's needs. But I am looking for this to be a collaborative effort - I personally don't have the time to write and brainstorm every step of the way! Below is the source code, and I look forward to working with anyone who is interested in helping write a calibre plugin interface (again, I tried, but I simply do not have the time to learn how to do it and then properly implement it). [In the meantime, there is an updated standalone version at www.ebookcleaner.com.] (Note: for anyone who is interested in using the source, there is a 'help.txt' file in the documents subfolder. It is somewhat sketchy, though.) I often don't make it to the internet more than a few times a week... please have patience when waiting for my response.
Last edited by slobberchopz; 03-03-2012 at 02:34 PM. |
02-12-2012, 08:31 PM | #2 |
Connoisseur
Posts: 99
Karma: 280
Join Date: Nov 2010
Device: iPhone6, iPadMiniRetina, KindlePW3, KoboGloHD
|
Won't install
Just in case someone else is confused as I was: this is not a finished plugin, so it won't install in Calibre. Thanks to the OP, and I hope someone can help with the necessary Calibre plugin code, since this looks like a great addition.
Last edited by ssholloway; 02-12-2012 at 08:34 PM. |
02-14-2012, 11:26 AM | #3 |
Banned
Posts: 132
Karma: 566638
Join Date: Aug 2011
Location: Wouldn't you like to know.
Device: Sony PRS-350:Sony PRS-T1:Rooted Nook Tablet
|
I actually would be interested in a stand-alone version of this that is not Calibre dependent... I don't want to have to load my book into Calibre and reformat it into HTMLZ. I OCR several PDF files and export them as regular HTML files, so if it were able to use regular HTML instead of HTMLZ it would be a better fit for my needs... perhaps an updated version of the one on the website, with some documentation.
|
02-15-2012, 03:15 PM | #4 |
Obsessive Compulsive?
Posts: 4
Karma: 54
Join Date: Feb 2012
Device: Kindle Keyboard
|
Please, is there no one with an intimate knowledge of the calibre API who can write ten or fewer lines of code that will retrieve the selected book in the requested format? I mean, shouldn't some basic functions that a plugin might use be easily available? Or documented? If so, can someone at least direct me to that information/example/documentation?
@jmaejr: I still have your email in mind. I haven't yet got around to the html reading part. Preferably, I would make a calibre plugin and let calibre read the html (no need to convert to htmlz) and give me the xhtml format. This would save a lot of time, errors, and debugging... Why reinvent the wheel? |
02-15-2012, 03:38 PM | #5 | |
Well trained by Cats
Posts: 30,516
Karma: 58055868
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Take a peek. |
|
02-15-2012, 07:39 PM | #6 | |
Banned
Posts: 132
Karma: 566638
Join Date: Aug 2011
Location: Wouldn't you like to know.
Device: Sony PRS-350:Sony PRS-T1:Rooted Nook Tablet
|
Quote:
If I knew Python better I would mess with your source code... Waits patiently for this tool |
|
02-16-2012, 01:36 AM | #7 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Great idea
I too have considered converting my (freeware) stand-alone program, GuteBook, into a calibre Plugin, though I lack Python programming skills AND any knowledge of the calibre API!!!
My GuteBook program extensively uses calibre commandline utilities, but as a Plugin it would escape its DOS roots, be "embedded" within the calibre GUI and thereby allow easier integration with my GUI and functionality. However, a GuteBook plugin would not tweak/operate on existing ebooks, but rather would retrieve and create ebooks from Project Gutenberg "on the fly" after applying many "clean-ups" and fixes, as your eBookCleaner would do. So, if you ever get this Plugin working, I may be knocking at your door to augment it for a GuteBook Plugin... I too look forward to seeing this Plugin written! Good luck! |
02-16-2012, 04:51 AM | #8 |
creator of calibre
Posts: 44,662
Karma: 24966646
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The API to get and set a file in a calibre book record is simply
Code:
# This will get the epub format for the book identified by id=book_id.
# The epub file will be copied to a temp file and the path to the temp
# file will be returned.
db.format(book_id, 'epub', index_is_id=True, as_path=True)

# This will set the epub format for the book record identified by book_id
stream = open(path_to_file, 'rb')
db.add_format(book_id, 'epub', stream, index_is_id=True)
|
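For anyone wiring this into a cleaner, here is a rough sketch of how the two calls above might be combined into a fetch, clean, write-back round trip. It uses only db.format and db.add_format with the arguments shown in the post. Everything else is an assumption made for illustration: clean_book and the clean_bytes callback are hypothetical names, FakeDB is a stand-in that exists only so the sketch runs outside calibre, and the check for a None return when the format is missing is a guess at how db.format behaves.

Code:
```python
import os
import tempfile


def clean_book(db, book_id, clean_bytes, fmt='epub'):
    """Fetch one format of a book, clean it, and store the result back.

    clean_bytes is a hypothetical callback: bytes in, cleaned bytes out.
    """
    # As in the post: returns the path to a temp copy of the format.
    src_path = db.format(book_id, fmt, index_is_id=True, as_path=True)
    if src_path is None:  # assumption: a missing format yields None
        return False

    with open(src_path, 'rb') as f:
        cleaned = clean_bytes(f.read())

    # Write the cleaned data to a new temp file, then hand it back as a stream.
    fd, out_path = tempfile.mkstemp(suffix='.' + fmt)
    with os.fdopen(fd, 'wb') as f:
        f.write(cleaned)
    with open(out_path, 'rb') as stream:
        db.add_format(book_id, fmt, stream, index_is_id=True)
    os.remove(out_path)
    return True


class FakeDB:
    """Stand-in for calibre's db object, NOT the real API implementation."""

    def __init__(self, data):
        self.data = data

    def format(self, book_id, fmt, index_is_id=False, as_path=False):
        # Mimic "copy to a temp file and return its path".
        fd, path = tempfile.mkstemp(suffix='.' + fmt)
        with os.fdopen(fd, 'wb') as f:
            f.write(self.data)
        return path

    def add_format(self, book_id, fmt, stream, index_is_id=False):
        self.data = stream.read()


if __name__ == '__main__':
    db = FakeDB(b'  <p>hello</p>  ')
    clean_book(db, 1, lambda raw: raw.strip())
    print(db.data)
```

Inside a real plugin, db would be the calibre database object and clean_bytes would be whatever entry point the cleaner exposes. A real epub is a zip archive, so the callback would have to unpack and repack it rather than treat it as one block of text.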
02-21-2012, 02:27 AM | #9 |
Zealot
Posts: 107
Karma: 554
Join Date: Oct 2008
Device: none
|
@slobberchopz: Is this plugin related to "Ebook Cleaner" plugin by BurbleBurble? I've got version 0.0.8 of that plugin somewhere in this forum.
|
02-23-2012, 05:25 AM | #10 |
Obsessive Compulsive?
Posts: 4
Karma: 54
Join Date: Feb 2012
Device: Kindle Keyboard
|
Well folks, I tried writing the plugin code, but it took me forever to work out the basics, and it still had a ways to go... and I just don't have the time. I still look forward to working with others on this, but by myself, who knows? Months, at best.
Meanwhile, I'll try to upload an updated standalone (but large...) version at www.ebookcleaner.com. I also updated the source code here (a huge number of changes and improvements, and I restarted the numbering from version 1.0.0) for anyone interested in working with it to create a calibre plugin; but it's back to py3.2 for now. @atjnjk: yes, it is based upon his project, but much improved. |