06-11-2010, 09:26 AM | #1 |
Junior Member
Posts: 5
Karma: 10
Join Date: Jun 2010
Location: Petersburg VA
Device: ipad
|
Numbers in pdfs not converting
I have inserted an example of what I'm getting. The "squares" are supposed to be numbers.
The Destruction of Dresden was first published in Great Britain by William Kimber & Co. Ltd on April , ; in a revised and updated edition by Corgi Books Ltd in ; and by Papermac, a division of Macmillan Publishers Ltd, in .
I am using Calibre to convert a PDF to EPUB. This is from the EPUB. I have Sigil but don't have a clue how to isolate these occurrences of numbers throughout the book and use a different font. I am using a MacBook Pro and an iPad. |
06-11-2010, 10:27 AM | #2 |
frumious Bandersnatch
Posts: 7,534
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
It's possible that the PDF is using some custom glyphs in the font for the numbers, instead of the standard number positions. For instance, some fonts include both "old-style" numbers and "normal" numbers: one set is placed in the usual positions for numbers, while the other occupies other positions and is therefore not recognized as numbers.
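One way to check this, as a minimal sketch: dump the bytes of every non-ASCII character in the converted text and see whether the "squares" sit at unusual code points. The function name non_ascii_hex is only illustrative; tr and od are standard tools.

```shell
#!/bin/bash
# Illustrative helper (name is hypothetical): print, as hex, the bytes of
# every non-ASCII character found in the file given as the first argument.
non_ascii_hex() {
  # Delete all 7-bit ASCII bytes, then show what is left as hex bytes.
  LC_ALL=C tr -d '\000-\177' < "$1" | od -An -tx1
}

# Example call (the file name is an assumption):
#   non_ascii_hex book.html
```

If the output shows three-byte sequences such as ef 9c b0, those decode to code points in the Unicode Private Use Area, which would be consistent with digits stored outside the standard number positions.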
|
06-23-2010, 01:22 PM | #3 |
Junior Member
Posts: 5
Karma: 10
Join Date: Jun 2010
Location: Petersburg VA
Device: ipad
|
How to fix numbers
I am sure there are a million possibilities of WHY the numbers aren't converting but is there a way to choose a default font for an entire doc or possibly every occurrence of say...numbers?
|
06-23-2010, 03:52 PM | #4 |
temp. out of service
Posts: 2,798
Karma: 24285242
Join Date: May 2010
Location: Duisburg (DE)
Device: PB 623
|
They'd be lowercase numbers, I assume.
The problem is technically the same as removing ligatures. |
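For comparison, a ligature cleanup might look like the sketch below. It assumes the text is UTF-8 and contains the standard Unicode ligature characters; the function name expand_ligatures is made up for illustration.

```shell
#!/bin/bash
# Illustrative helper: replace common Latin ligature characters with their
# plain-letter spellings. Reads stdin, writes stdout.
expand_ligatures() {
  sed -e 's/ﬃ/ffi/g' \
      -e 's/ﬄ/ffl/g' \
      -e 's/ﬀ/ff/g' \
      -e 's/ﬁ/fi/g' \
      -e 's/ﬂ/fl/g'
}

# Example call (file names are assumptions):
#   expand_ligatures < book.html > book-fixed.html
```

The old-style numbers can be handled the same way: one special character in, one ordinary character out.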
06-24-2010, 08:43 AM | #5 | |
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
|
Am I the only one who can see the numbers in this quote?
Quote:
Yes, their appearance is of "lowercase" or "old style" numerals. It should be possible to do a find and replace on them using this information. Rather than starting with calibre, you might want to use pdftohtml and pdfreflow to get an HTML file, then find and replace in the HTML, and then use calibre to convert to ePub. This is all scriptable. Something like:
Code:
#!/bin/bash
# Usage: pass the PDF file name as the first argument.
filename="$1"
# Extract the PDF to XML, then reflow the XML into HTML.
pdftohtml -xml "$filename"
pdfreflow "${filename%.pdf}.xml"
# Translate the old-style numeral characters to 0-9; the first set must list the ten glyphs in 0-9 order.
sed -i -e 'y//0123456789/' "${filename%.pdf}.html"
# Convert the corrected HTML to ePub with calibre.
ebook-convert "${filename%.pdf}.html" "${filename%.pdf}.epub"
Last edited by frabjous; 06-24-2010 at 09:00 AM. |
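As a hedged variation on the find-and-replace step: the sketch below assumes the ten old-style digits are stored at U+F730 through U+F739, a Private Use Area range that some fonts use for old-style figures. That range is only an assumption, so check the actual bytes in your file first. It uses one s command per digit rather than y, because byte-wise substitution does not depend on the locale treating each glyph as a single character. The function name fix_oldstyle_digits is hypothetical.

```shell
#!/bin/bash
# Illustrative helper: map assumed old-style digit glyphs (U+F730..U+F739,
# UTF-8 bytes EF 9C B0 .. EF 9C B9) to the ordinary digits 0-9.
# Reads stdin, writes stdout.
fix_oldstyle_digits() {
  local digit glyph
  local -a args=()
  for digit in 0 1 2 3 4 5 6 7 8 9; do
    # Build the three-byte UTF-8 sequence for this glyph from octal escapes.
    glyph=$(printf "\\357\\234\\$(printf '%o' $((0260 + digit)))")
    args+=(-e "s/${glyph}/${digit}/g")
  done
  # LC_ALL=C makes sed work on raw bytes.
  LC_ALL=C sed "${args[@]}"
}

# Example call (file names are assumptions):
#   fix_oldstyle_digits < book.html > book-fixed.html
```

If the glyphs turn out to live at other code points, only the byte values in the loop need to change.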
|
06-25-2010, 09:57 AM | #6 |
Junior Member
Posts: 5
Karma: 10
Join Date: Jun 2010
Location: Petersburg VA
Device: ipad
|
I sent in the exact view of what I'm seeing. They are legit fonts. The publisher is aware that the numbers use a different font.
All I wanna know is THIS: how do you isolate a font globally in a PDF or EPUB and use a more generic font? If you have a fix for this, I would like to hear it. I'm sure it's quite simple. |
06-25-2010, 09:58 AM | #7 |
Junior Member
Posts: 5
Karma: 10
Join Date: Jun 2010
Location: Petersburg VA
Device: ipad
|
That sounds "do-able". I think I'll try it!
|
06-25-2010, 10:20 AM | #8 |
Junior Member
Posts: 5
Karma: 10
Join Date: Jun 2010
Location: Petersburg VA
Device: ipad
|
OK
Here's what I did for the Nuremberg.pdf file:
#!/bin/bash
filename="$1"
pdftohtml -xml "$NUREMBERG"
pdfreflow "${NUREMBERG%.pdf}.xml"
sed -i -e 'y//0123456789/' "${NUREMBERG%.pdf}.html"
ebook-convert "${NUREMBERG%.pdf}.html" "${NUREMBERG%.pdf}.epub"
But apparently I haven't compiled the reflow thing right. I ran configure, then makefile. Here's the output I got when I ran the above:
-bash: ebook-convert: command not found
MacBookPro:~ wth$
My machine is called Macbookpro, and wth of course is my home name. Thanks for your help. |
06-25-2010, 12:16 PM | #9 |
frumious Bandersnatch
Posts: 7,534
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
"ebook-convert" is part of Calibre. Do you have Calibre correctly installed? Is it in your PATH?
|
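A quick way to test the PATH question is sketched below. The directory in the example call, /Applications/calibre.app/Contents/MacOS, is an assumption about where a Mac install keeps its command-line tools and may differ on your machine; the function name ensure_on_path is hypothetical.

```shell
#!/bin/bash
# Illustrative helper: make sure a command can be found, adding a fallback
# directory to PATH if the command lives there.
#   $1 = command name, $2 = directory to try if the command is not on PATH
ensure_on_path() {
  local cmd="$1" dir="$2"
  if command -v "$cmd" >/dev/null 2>&1; then
    return 0                      # already reachable
  fi
  if [ -x "$dir/$cmd" ]; then
    PATH="$dir:$PATH"             # found in the fallback directory
    export PATH
    return 0
  fi
  echo "$cmd not found on PATH or in $dir" >&2
  return 1
}

# Example call (the directory is an assumption):
#   ensure_on_path ebook-convert /Applications/calibre.app/Contents/MacOS
```

Note that a PATH change made this way lasts only for the current shell session or script.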
06-25-2010, 05:18 PM | #10 | |||
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
|
Quote:
fixnumbers.sh nuremberg.pdf and it should work. Quote:
Quote:
Last edited by frabjous; 06-25-2010 at 05:22 PM. |
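To illustrate why the file name goes on the command line rather than into the script: the script reads it from "$1", and every other name is derived from it with ${filename%.pdf}, which strips a trailing .pdf. A minimal sketch, with a made-up function name:

```shell
#!/bin/bash
# Illustrative helper: show the file names the script derives from its
# first argument by stripping a trailing ".pdf".
derived_names() {
  local filename="$1"
  printf '%s\n' \
    "${filename%.pdf}.xml" \
    "${filename%.pdf}.html" \
    "${filename%.pdf}.epub"
}

# Example call:
#   derived_names nuremberg.pdf
```

So the variable inside the script stays as filename, and only the argument changes from book to book.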
|||
|