10-10-2007, 12:31 AM | #1 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
Extended characters
At the request of JSWolf, I am posting a new thread concerning the use of extended ASCII characters like curly quotes, em-dashes, apostrophes, etc.
I took "An Intimate Study of Sherlock Holmes", recently posted by RWood, and tried to open it in FBReader. Like a lot of programs, FBReader didn't display the curly quotes and em-dashes correctly. Thinking that this was due to the use of the extended ASCII characters themselves instead of the equivalent HTML entities, I used Amber Palm Converter to get some HTML to experiment with. Although my experiment worked, it appeared that the Amber software makes substantial changes to the HTML that it creates.

Below, I have attached a Zip file containing an HTML file that is hopefully closer to the original. I used a program called MakeDoc to extract this file from the posted PRC. I then took this HTML file and replaced all curly quotes, apostrophes and em-dashes with the HTML entities. This second file (also in the Zip) displayed correctly in FBReader.

I don't mean to pick on just FBReader; I have seen other programs fail to display extended ASCII characters correctly too. I would expect the HTML entities for these characters to display correctly in all cases. As I mentioned in the other thread, I think this problem is due to these characters being interpreted differently depending on the language and code page in use (or to improper interpretation by the software; I don't know which).
Last edited by jbenny; 10-10-2007 at 12:40 AM. |
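To illustrate the code page point, here is a minimal Python sketch. It assumes the book text was saved as Windows-1252, which nothing in this thread confirms; the sample bytes and the helper name `decode_as` are made up for the illustration. The same bytes come out as curly quotes under one code page and as invisible control characters or replacement marks under others.

```python
# Sample bytes, assuming Windows-1252: 0x93/0x94 are the double curly quotes,
# 0x97 is the em-dash and 0x92 is the curly apostrophe.
RAW = bytes([0x93, 0x48, 0x69, 0x94, 0x20, 0x97, 0x20, 0x92])


def decode_as(data: bytes, encoding: str) -> str:
    """Decode data with the given encoding, marking undecodable bytes with U+FFFD."""
    return data.decode(encoding, errors="replace")


as_cp1252 = decode_as(RAW, "cp1252")   # curly quotes and em-dash as intended
as_latin1 = decode_as(RAW, "latin-1")  # same bytes become C1 control characters
as_utf8 = decode_as(RAW, "utf-8")      # same bytes are invalid, so U+FFFD marks

for name, text in (("cp1252", as_cp1252), ("latin-1", as_latin1), ("utf-8", as_utf8)):
    print(name, [hex(ord(ch)) for ch in text])
```

A reader program that guesses the wrong code page would show something like the second or third result, which may be the kind of garbage seen here.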
10-10-2007, 12:36 AM | #2 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
Although I mentioned this in the other post, I thought I would duplicate this here.
The single left and right curly quotes are ‘ ’ (&lsquo; and &rsquo;). The double left and right curly quotes are “ ” (&ldquo; and &rdquo;). The em-dash is — (&mdash;). Other extended ASCII characters and some foreign characters also have equivalent HTML entities. |
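The replacement described above can be sketched in a few lines of Python. This is only one way to do it, not the procedure actually used for the attached files; the names `ENTITY_MAP` and `to_entities` are hypothetical, and the input is assumed to be already decoded to a Python string.

```python
# Map each extended character to its named HTML entity.
ENTITY_MAP = {
    "\u2018": "&lsquo;",  # left single curly quote
    "\u2019": "&rsquo;",  # right single curly quote / curly apostrophe
    "\u201c": "&ldquo;",  # left double curly quote
    "\u201d": "&rdquo;",  # right double curly quote
    "\u2014": "&mdash;",  # em-dash
}

# str.maketrans accepts a dict of single characters mapped to replacement strings.
_TABLE = str.maketrans(ENTITY_MAP)


def to_entities(text: str) -> str:
    """Return text with curly quotes and em-dashes replaced by HTML entities."""
    return text.translate(_TABLE)


sample = "\u201cIt\u2019s elementary\u201d \u2014 Holmes"
print(to_entities(sample))
```

All other characters pass through unchanged, so the rest of the HTML markup is left alone.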
10-10-2007, 12:38 AM | #3 |
Resident Curmudgeon
Posts: 76,007
Karma: 134368292
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
If I make a simple PRC file, it works in FBReader. I didn't clean this up at all; I just loaded it and made it.
This is the simple PRC made in BD. The Mobipocket PRC still had the same problems. So I think the problem is not the actual characters, but that whatever format BD outputs is not fully compatible with FBReader. |
10-10-2007, 12:43 AM | #4 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
The file you posted did not display correctly for me in FBReader. Perhaps BD is substituting the actual character for the HTML entity. Maybe there is an option to disable this?
|
10-10-2007, 12:49 AM | #5 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
I just used MakeDoc to extract the HTML from that last PRC you posted. It did not contain the HTML entities that I used. As far as I know, MakeDoc doesn't change anything; it just extracts the files. So it looks to me like BD is doing some character substitution.
|
10-10-2007, 01:18 AM | #6 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
OK, latest test. I took the Study2.html file, which has the HTML entities that I added, and used MakeDoc to create a Palm ebook. This opened and displayed correctly in FBReader.
Considering that a MobiPocket ebook is basically HTML wrapped in a Palm PRC file, I don't know how much what MakeDoc created differs from a real MobiPocket ebook. In any event, this seems to show that MakeDoc, unlike BD, doesn't alter the HTML entities for extended ASCII characters. |
10-10-2007, 01:30 AM | #7 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
JSWolf, one other thing we could try is for you to create a MobiPocket ebook in BD again, using that second file I provided (the one with the HTML entities). Only this time, disable compression when creating the MobiPocket file (if you can). The resulting file can then be examined with a hex editor (I have one if you don't) to see whether BD did indeed substitute characters on us, as I believe is happening.
|
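As an alternative to eyeballing a hex dump, the same check could be scripted. The sketch below is only a rough aid under two assumptions that are not confirmed in this thread: the file is uncompressed, and any substituted characters are stored as single Windows-1252 bytes. The function name `count_markers` is made up.

```python
# Byte values that Windows-1252 uses for curly quotes and the em-dash.
RAW_BYTES = (0x91, 0x92, 0x93, 0x94, 0x97)
# The named entities that were put into the second HTML file.
ENTITIES = (b"&lsquo;", b"&rsquo;", b"&ldquo;", b"&rdquo;", b"&mdash;")


def count_markers(data: bytes) -> dict:
    """Count raw extended bytes and named entities found in data."""
    raw = sum(data.count(bytes([value])) for value in RAW_BYTES)
    entities = sum(data.count(entity) for entity in ENTITIES)
    return {"raw_bytes": raw, "entities": entities}


print(count_markers(b"\x93Hi\x94 &mdash; it&rsquo;s"))
```

To run it on a real file, read the file in binary mode first, for example `count_markers(open("book.prc", "rb").read())`, where `book.prc` is a placeholder name. Many raw bytes and no entities would suggest substitution. The count is only a hint, since the same byte values can also occur in binary headers or in multi-byte encodings.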
10-10-2007, 01:49 AM | #8 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
Never mind. Deleted.
Last edited by jbenny; 10-10-2007 at 01:52 AM. |
10-10-2007, 02:47 AM | #9 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
I've moved this thread to "Upload Help" so it doesn't show up in the book index.
|
10-10-2007, 03:49 AM | #10 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
Harry, thanks for moving this.
Not that it matters for the current problem, but I see from the file RWood originally posted that BD uses a type/creator code of "TEXtREAd", which corresponds to "Palm DOC" and not "MobiPocket". This is according to the MobileRead Wiki: https://wiki.mobileread.com/wiki/PDB. I don't know whether this matters to reader software when it displays the ebook. |
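For anyone who wants to check this code without a hex editor, here is a small Python sketch. It relies on the standard Palm database header layout, in which the 4-byte type and the 4-byte creator sit at offsets 60 and 64; the function name `pdb_type_creator` and the fake header in the example are made up for illustration.

```python
import struct
from typing import Tuple

TYPE_OFFSET = 60  # the 4-byte type is followed directly by the 4-byte creator


def pdb_type_creator(header: bytes) -> Tuple[str, str]:
    """Return the (type, creator) codes from the start of a PDB/PRC file."""
    if len(header) < TYPE_OFFSET + 8:
        raise ValueError("need at least 68 bytes of the file header")
    type_code, creator = struct.unpack_from("4s4s", header, TYPE_OFFSET)
    return type_code.decode("ascii", "replace"), creator.decode("ascii", "replace")


# Fake header for demonstration only: 60 filler bytes, then the 8-byte code.
fake_header = b"\x00" * 60 + b"TEXtREAd" + b"\x00" * 10
print(pdb_type_creator(fake_header))
```

With a real file, pass in the first 68 or more bytes read in binary mode. A Palm DOC file should give "TEXt" and "REAd", while a MobiPocket file should give "BOOK" and "MOBI".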
10-10-2007, 09:26 AM | #11 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
FWIW, the curly quotes of the original PRC show up fine in MobiPocket Reader on both the PC and Pocket PC versions. I'll try it on my iLiad when I get home from work.
|
10-10-2007, 09:36 AM | #12 |
Resident Curmudgeon
Posts: 76,007
Karma: 134368292
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
The problem is that when reading the PRC in FBReader, you get garbage instead of some of the proper characters. If we use BD to create a simple PRC instead of a Mobipocket PRC, FBReader displays these fine. Do you know the difference between Mobipocket and simple PRC for most books made with BD?
|
10-10-2007, 09:49 AM | #13 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Simple PRC is basic PalmDoc. Plain text: no hyperlinks, no styles (bold, italic, centering, etc.), no anything, basically. Not something you want to use.
|
10-10-2007, 10:36 AM | #14 |
Resident Curmudgeon
Posts: 76,007
Karma: 134368292
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
I just noticed that as I was fiddling with it. What formats can FBReader read that would keep styles and curly quotes?
|
10-10-2007, 05:19 PM | #15 |
Addict
Posts: 323
Karma: 358
Join Date: May 2007
Device: Tablet PC and Nokia N800
|
FBReader can read quite a few formats. However, I don't think that is the problem. Besides, asking people to create yet another format for submissions here seems counterproductive. Like I said, I think the problem can be easily solved (in any format) by using the HTML entities instead of the extended characters. I have seen this type of problem in other software before.
Attached is an ebook. I used the HTML that I extracted from the original ebook and replaced the curly quotes, curly apostrophes and em-dashes with HTML entities. I then used MakeDoc to create a PRC (without the images). This displays correctly in FBReader. From looking at the PRC that BD created, both BD and MakeDoc seem to be wrapping the HTML in a PRC file, with compression. My understanding is that this is essentially what a MobiPocket ebook is. The only real difference I can see is that I used the HTML entities, so those characters displayed correctly. |
|