MobileRead Forums > E-Book Formats > PDF

Old 01-01-2009, 10:10 AM   #1
TadW
Uebermensch
Posts: 2,583
Karma: 1094606
Join Date: Jul 2003
Location: Italy
Device: Kindle
How to Do Everything with PDF Files

The following article gives a good overview of what you can do with PDF files (without using the expensive Adobe Acrobat):

http://www.labnol.org/software/adobe...tutorial/6296/
Old 01-01-2009, 10:24 AM   #2
Nate the great
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
good find
Old 01-01-2009, 10:30 AM   #3
xianfox
Ebook Addict
Posts: 225
Karma: 2136
Join Date: Jul 2003
Location: Appleton, Wisconsin, USA
Device: Kindle Paperwhite Signature Edition
Thanks, some of that will come in handy at work.
Old 01-01-2009, 10:45 AM   #4
JSWolf
Resident Curmudgeon
Posts: 76,402
Karma: 136466962
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
But I notice no good way to convert from PDF.
Old 01-02-2009, 06:10 PM   #5
smithno
Groupie
Posts: 152
Karma: 222484
Join Date: Jul 2008
Location: Tenn., US
Device: jetBook, jetBook Lite, Kindle 3, Galaxy Note II
Quote:
Originally Posted by JSWolf
But I notice no good way to convert from PDF.
PDF was designed as an output format. It will probably never be easy to manipulate.
Old 01-02-2009, 07:57 PM   #6
RWood
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
Quote:
Originally Posted by JSWolf
But I notice no good way to convert from PDF.
"Good" is a matter of conjecture, Jon. The article suggests that "You can upload the PDF document to Zamzar and convert it any formats like doc, html, png, txt or rtf (rich text format). Alternatively, you can convert PDF to HTML using Gmail."

I have used ABBYY PDF Transformer 2.0, ABC Amber PDF Converter, Paperport, and several other packages over the years. There is no single solution for all cases. The correct choice depends on the specific PDF in question, the tools on your computer, what tools are currently available for free, what tools you can get as a functioning trial copy, and how much money you are willing to spend on new tools.

While I am not the biggest fan of PDF for ebooks, PDFs have their place and I have created PDF files for the Sony Reader where I felt they were the best option.
Old 01-03-2009, 03:57 AM   #7
alexxx
Connoisseur
Posts: 68
Karma: 479602
Join Date: Aug 2006
Device: Kindle DX
Too many of the options proposed in the article involve uploading your document to some server.
Call me paranoid, but I don't like this kind of "service" at all - I want my documents to stay on <my> server.
Apart from that, under Linux (which is not mentioned at all in the article) software exists to do practically any kind of conversion you need.



alessandro
Old 01-03-2009, 05:52 AM   #8
Flinx
Connoisseur
Posts: 63
Karma: 12132
Join Date: Sep 2006
Location: Germany
Device: Cybook Muse Frontlight, Cybook Odyssey
Quote:
Originally Posted by alexxx
Apart from that, under Linux (which is not mentioned at all in the article) software exists to do practically any kind of conversion you need.
alessandro
Really? I searched for one and found no Linux program at all that tries to convert PDF to flowing (reflowable) text with attributes and paragraph recognition. The only program I could find that generates useful output is PdfGrabber, but I am still interested in a better solution.
Old 01-03-2009, 05:24 PM   #9
bookbinder
Connoisseur
Posts: 67
Karma: 813
Join Date: Jun 2007
Location: Massachusetts, USA
Device: Kindle Paperwhite 2, FW:5.6.1
Google Books

I have a few scanned Google Books in PDF that I'm having a hard time converting to text, even following advice from the article. Has anyone done this successfully? I've tried:
-Zamzar (returns an unopenable doc file)
-Google mail (doesn't display pdf as html)
-Pdf2Word program
Old 01-04-2009, 02:46 AM   #10
labnol
PDF Geek
Posts: 1
Karma: 10
Join Date: Jan 2009
Device: none
Use Google

Quote:
Originally Posted by bookbinder
I have a few scanned Google Books in PDF that I'm having a hard time converting to text, even following advice from the article. Has anyone done this successfully?
You can upload the scanned PDF files to a public web server, link to those files from a web page, and then wait for Google's bots to index those PDFs. See complete instructions.
Old 01-04-2009, 07:52 AM   #11
Flinx
Connoisseur
Posts: 63
Karma: 12132
Join Date: Sep 2006
Location: Germany
Device: Cybook Muse Frontlight, Cybook Odyssey
Quote:
Originally Posted by labnol
...wait for google bots to index those PDF.
The linked example shows why this way is essentially useless. The resulting text has line breaks on each line. A good converter for books has to try to set a line break only at the end of a paragraph.
Old 01-04-2009, 09:59 AM   #12
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by Flinx
The linked example shows why this way is essentially useless. The resulting text has line breaks on each line. A good converter for books has to try to set a line break only at the end of a paragraph.
Really not true at all. You can also use the convention that two line breaks in a row indicate a new paragraph, as TeX and LaTeX do. It is trivial to convert between the two conventions using some simple program or a one-line script.
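
A minimal sketch of the kind of conversion described here, assuming the input already marks paragraphs with blank lines. The function names are hypothetical and not from any tool mentioned in this thread. It does not recover paragraphs from text that has a line break on every line and no blank lines.

```python
import re
import textwrap


def paragraphs_to_single_lines(text: str) -> str:
    """Turn blank-line-separated paragraphs into one line per paragraph."""
    # Split wherever one or more blank lines separate two paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Collapse the line breaks inside each paragraph into single spaces.
    return "\n".join(" ".join(p.split()) for p in paragraphs)


def single_lines_to_paragraphs(text: str, width: int = 60) -> str:
    """Turn one-line-per-paragraph text into wrapped, blank-line-separated paragraphs."""
    lines = [line for line in text.splitlines() if line.strip()]
    return "\n\n".join(textwrap.fill(line, width=width) for line in lines)


if __name__ == "__main__":
    sample = "First line of\nparagraph one.\n\nSecond\nparagraph."
    print(paragraphs_to_single_lines(sample))
```

Going from wrapped text to one line per paragraph discards the original positions of the line breaks. The reverse direction re-wraps at whatever width is chosen.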
Old 01-04-2009, 02:24 PM   #13
Flinx
Connoisseur
Posts: 63
Karma: 12132
Join Date: Sep 2006
Location: Germany
Device: Cybook Muse Frontlight, Cybook Odyssey
Quote:
Originally Posted by tompe
Really not true at all. You can also use the convention that two line breaks in a row indicates a new paragraph
No, that is not really useful for most standard PDFs. A text object in a PDF file does not contain a real line break. It contains the position on the page where it has to be drawn and a number of characters. The result is a line of text.
The program that does the conversion has to estimate, from the positions of the text objects, the order in which the lines come. Simple converters, like most of those available (including Acrobat), take one text object, convert it to text, and set a line break at the end, producing one line of output text. Better converters can try to join separate text objects if their horizontal start positions are identical and the line is long enough. But this is a difficult job, and I have not yet found a program that works well enough for me.
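
A rough sketch of the joining heuristic described above. It assumes the text objects have already been extracted into a simple record with a start position and a string. The `TextObject` type, the tolerance, and the "long enough" threshold are all illustrative assumptions, not the behaviour of Acrobat or any other converter named in this thread.

```python
from dataclasses import dataclass


@dataclass
class TextObject:
    x: float   # horizontal start position on the page
    y: float   # vertical position; assumed to grow upward, so larger = higher
    text: str  # the characters drawn at that position (one visual line)


def join_text_objects(objects, x_tolerance=1.0, min_fill=0.8):
    """Group positioned lines into paragraphs.

    A line is appended to the current paragraph when it starts at the same
    horizontal position as the previous line and the previous line was
    "long enough" (at least min_fill of the longest line). Otherwise it
    starts a new paragraph.
    """
    ordered = sorted(objects, key=lambda o: -o.y)  # top of page first
    if not ordered:
        return []
    full_width = max(len(o.text) for o in ordered)
    paragraphs = [ordered[0].text.strip()]
    prev = ordered[0]
    for obj in ordered[1:]:
        same_start = abs(obj.x - prev.x) <= x_tolerance
        prev_is_full = len(prev.text) >= min_fill * full_width
        if same_start and prev_is_full:
            paragraphs[-1] += " " + obj.text.strip()
        else:
            paragraphs.append(obj.text.strip())
        prev = obj
    return paragraphs


if __name__ == "__main__":
    page = [
        TextObject(72, 700, "The quick brown fox jumps over the lazy"),
        TextObject(72, 688, "dog and then rests."),
        TextObject(72, 676, "A new paragraph starts here and it runs"),
        TextObject(72, 664, "on to a second line."),
    ]
    for paragraph in join_text_objects(page):
        print(paragraph)
```

Even this toy version shows why the job is hard: it measures line length in characters rather than rendered width, and it ignores hyphenation, indented first lines, multiple columns, and headers or footers.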
Old 01-04-2009, 02:51 PM   #14
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by Flinx
No, that is not really useful for most standard PDFs. A text object in a PDF file does not contain a real line break. It contains the position on the page where it has to be drawn and a number of characters. The result is a line of text.
The program that does the conversion has to estimate, from the positions of the text objects, the order in which the lines come. Simple converters, like most of those available (including Acrobat), take one text object, convert it to text, and set a line break at the end, producing one line of output text. Better converters can try to join separate text objects if their horizontal start positions are identical and the line is long enough. But this is a difficult job, and I have not yet found a program that works well enough for me.
That might be the case, but there is no functional difference between encoding paragraphs with two line breaks or one. What you are talking about is how good a converter is at detecting a paragraph break, but that has no necessary connection to how the encoding is done. You can argue that you lose information if you do not keep the line breaks within a paragraph, since they are impossible to recreate, but it is trivial to take a paragraph delimited by double line breaks and convert it to one line.
Old 01-05-2009, 05:28 AM   #15
stonehat
Re-Iliadist
Posts: 23
Karma: 10
Join Date: Oct 2008
Device: none
From TFA:
"Most mobile phones can read PDF files."

I stopped reading after that.

All times are GMT -4.