Old 04-28-2009, 06:39 AM   #1
Hanselda
Enthusiast
Hanselda began at the beginning.
 
Posts: 42
Karma: 12
Join Date: Feb 2008
Device: CyBook, Sony PRS 600
How to include a pdf crop plugin.

Hello all,

I have written several Python scripts for my Cybook G3 device to deal with the PDF format. I have done quite a lot of experimentation, so I would like to contribute my experience as several plugins.

The difficulty is that what I want to do does not quite fit calibre's design philosophy of ANYTHING -> HTML -> ANYTHING ELSE. I would like to include the following:

1. Crop the pdf document margins to fit better to the reading device.

2. Split multi-column pdf document.

3. Convert PDF pages into images and compile the images back into a PDF file. This works around the limited implementation of the PDF format on many reading devices: if the PDF file contains too many complicated designs, the reading device may crash. However, one can always "render" the PDF on a computer as images and so make the pages readable.

4. Convert DjVu into PDF. Basically, the pages of a DjVu file can also be treated as images and compiled into a PDF.

None of these conversions needs an intermediate conversion to HTML, and all of them should avoid it. I have already written several Python scripts that do all of these jobs. Could anyone give me some directions on how to start?

Thanks!
Old 04-28-2009, 07:52 AM   #2
kovidgoyal
creator of calibre
Posts: 44,515
Karma: 24495784
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You're in luck, the new calibre conversion framework has support for two conversion paths:

1) input format -> html -> output format

2) input format -> sequence of images -> output format

(2) is already being used for conversion of comics (.cbr and .cbz). It can be used for image based converters of PDF and DJVU as well.

The first step is getting the new code running. On a Unix-like OS, you can check out the code with the command

bzr branch lp:~kovid/calibre/pluginize

Then run

python setup.py build
python setup.py develop

Now you can use the

ebook-convert input-ebook output-ebook

command to run the new conversion pipeline. Take a look at the Python module calibre.ebooks.comic.input to see how to write an input plugin for an image based format. There is already a plugin for PDF; you can add options to it so that the user can select image based conversion instead of the default HTML based conversion.

If you need more help, you can ask in the #calibre IRC channel on freenode, mail the calibre-devs mailing list, or just post here.
Old 04-28-2009, 08:02 AM   #3
user_none
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Quote:
Originally Posted by Hanselda
1. Crop the pdf document margins to fit better to the reading device.
0.5.x includes the pdftrim utility to do this. In 0.6 it has been moved to become a command in pdfmanipulate, e.g. `pdfmanipulate crop input.pdf`

Quote:
Originally Posted by Hanselda
2. Split multi-column pdf document.
A long-standing feature request that a lot of people want to see. However, because of the complexity of the task, no work has been done toward it.

Quote:
Originally Posted by Hanselda
3. Convert PDF pages into images and compile the images again into a PDF file.
This is doable. All that would need to be done is to extend the cover extraction code (0.6) to convert the entire document, then save the result as an OEB book so that PDF output for comic conversion (I haven't completed it yet...) can assemble the images into a new PDF. It could be added as a conversion option. However, cases where the input size differs from the output size would have to be handled.
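
As a rough illustration of the size-mismatch case, here is a minimal sketch of fitting a rendered page image onto an output page while keeping its aspect ratio. The function name `fit_to_page` and the choice to center the image are assumptions for illustration only; this is not calibre code.

```python
def fit_to_page(img_w, img_h, page_w, page_h):
    """Scale an image to fit inside a page, keeping its aspect ratio.

    Hypothetical helper, not part of calibre. All sizes use the same unit.
    Returns (scaled_w, scaled_h, x_offset, y_offset), with the scaled image
    centered on the page.
    """
    if min(img_w, img_h, page_w, page_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # The smaller ratio is the largest scale at which both sides still fit.
    scale = min(page_w / img_w, page_h / img_h)
    scaled_w = img_w * scale
    scaled_h = img_h * scale
    # Split the leftover space evenly to center the image.
    x_offset = (page_w - scaled_w) / 2
    y_offset = (page_h - scaled_h) / 2
    return scaled_w, scaled_h, x_offset, y_offset
```

A square image placed on a portrait page fills the page width and leaves equal bands above and below.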

Quote:
Originally Posted by Hanselda
4. Convert Djvu into PDF.
Hm... DjVu isn't really an ebook format...


Quote:
Originally Posted by Hanselda
All of these conversions does not and should avoid the conversion to HTML in between.
The conversion framework in 0.6 assembles everything into an OEB book. While it is HTML based, it is still handy for page ordering. For item 3 it would still be helpful for this.

Quote:
Originally Posted by Hanselda
Basically I have already make several python scripts to do all those job. Could anyone give me some directions how could I start?
http://bazaar.launchpad.net/~kovid/c...re/ebooks/pdf/ You can see what is already done and what you can improve upon. I would recommend looking at the sticky on setting up a calibre development VM. Also, the upcoming 0.6 currently lives in the pluginize branch, and that is where you would want to focus.
Old 04-28-2009, 09:35 AM   #4
Hanselda
Enthusiast
Hanselda began at the beginning.
 
Posts: 42
Karma: 12
Join Date: Feb 2008
Device: CyBook, Sony PRS 600
Quote:
Originally Posted by Hanselda
1. Crop the pdf document margins to fit better to the reading device.
This is exactly what I have done before. However, I tried to make a GUI with PyQt for it, so that one can drag out the area to crop. It would be nice to include it in calibre.
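
The core of such a GUI is mapping the dragged rectangle to a crop box. A minimal sketch, assuming the preview widget shows the whole page stretched to the widget size, that the page's lower-left corner is at the PDF origin, and using the hypothetical name `drag_to_crop_box` (this is not taken from the actual scripts):

```python
def drag_to_crop_box(drag, widget_size, page_size):
    """Map a rectangle dragged on a page preview to a PDF crop box.

    Hypothetical helper for illustration.
    drag        -- (x0, y0, x1, y1) in widget pixels, origin at the top-left
    widget_size -- (width, height) of the preview in pixels
    page_size   -- (width, height) of the PDF page in points
    Returns (left, bottom, right, top) in points, origin at the bottom-left.
    """
    widget_w, widget_h = widget_size
    page_w, page_h = page_size
    if widget_w <= 0 or widget_h <= 0:
        raise ValueError("widget size must be positive")

    def clamp(value, upper):
        # Keep coordinates inside the preview if the drag leaves the widget.
        return min(max(value, 0), upper)

    x0, y0, x1, y1 = drag
    # Sort so the drag may go in any direction.
    left_px, right_px = sorted((clamp(x0, widget_w), clamp(x1, widget_w)))
    top_px, bottom_px = sorted((clamp(y0, widget_h), clamp(y1, widget_h)))

    sx = page_w / widget_w
    sy = page_h / widget_h
    # Widget y grows downward, PDF y grows upward, so flip the vertical axis.
    return (left_px * sx, page_h - bottom_px * sy,
            right_px * sx, page_h - top_px * sy)
```

The flip of the vertical axis is the step that is easy to get wrong: the top edge of the dragged area becomes the larger y value in PDF coordinates.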

Quote:
Originally Posted by Hanselda
2. Split multi-column pdf document.
There is a dirty way to do this. One can duplicate each multi-column page and use the crop tool from 1 to turn each copy into a page showing a single column. This works, although the final PDF file will be several times larger.
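
A minimal sketch of the crop boxes this approach needs, one per duplicated page. It assumes equal-width columns separated by a fixed gutter; the function name `column_crop_boxes` and its parameters are hypothetical.

```python
def column_crop_boxes(page_box, columns=2, gutter=0.0):
    """Split a page box into equal-width column boxes, left to right.

    Hypothetical helper for illustration.
    page_box -- (left, bottom, right, top) in points
    columns  -- number of columns on the page
    gutter   -- width of the gap between adjacent columns, in points
    Returns a list of (left, bottom, right, top) boxes, one per column.
    """
    left, bottom, right, top = page_box
    if columns < 1:
        raise ValueError("columns must be at least 1")
    # Width left for text once the gutters between columns are removed.
    col_w = (right - left - gutter * (columns - 1)) / columns
    if col_w <= 0:
        raise ValueError("gutter leaves no room for the columns")
    boxes = []
    for i in range(columns):
        col_left = left + i * (col_w + gutter)
        boxes.append((col_left, bottom, col_left + col_w, top))
    return boxes
```

Each copy of the original page would then be cropped to one of the returned boxes, in reading order.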