#1 |
Evangelist
Posts: 431
Karma: 7000000
Join Date: Aug 2013
Location: Hamden, CT
Device: Kindle Paperwhite (11th gen), Scribe
|
Spell checking...every word is always shown at least once
Here's a sample of the spell check results for one eBook; the same thing happens with every book.
Note that words that are obviously spelled correctly ("as", "at", "be", "bed", etc.) are listed even though "Show only misspelled words" is checked. Those words also appear in the book far more often than the listed count, so it looks like the spell checker thinks only one instance of each word is misspelled.
The eBook has the language set to "en" (no qualifiers like "en-US") in both the OPF and each HTML page. In "Manage Dictionaries", "United States" is set as the preferred variant for English. Is there any other setting I should check that might be the culprit?
#2 |
creator of calibre
Posts: 44,180
Karma: 23000000
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
If you are using the built-in English dictionary and your books actually have their language specified as English, then it will work. So one of those conditions is not what you think it is.
#3 |
Well trained by Cats
Posts: 30,170
Karma: 57532200
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
You may have conflicting languages set somewhere.
The Library view may say English, but the book's OPF may say something else in <dc:language>en</dc:language>, or the individual HTML files may in their <html> tag.
Code:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en" xml:lang="en">
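One way to see every language declaration in one place is to read them straight out of the EPUB. The following is only a rough Python sketch and not part of calibre: the function names (opf_languages, html_lang_attrs, scan_epub) are made up for this example, and it assumes the book is an ordinary zipped EPUB whose text files are UTF-8.

```python
import re
import sys
import zipfile
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The opening <html ...> tag and the lang / xml:lang attributes inside it.
HTML_TAG = re.compile(r"<html\b[^>]*>", re.IGNORECASE)
LANG_ATTR = re.compile(r"""\b(xml:lang|lang)\s*=\s*["']([^"']*)["']""")


def opf_languages(opf_text):
    """Return every <dc:language> value declared in an OPF document."""
    root = ET.fromstring(opf_text.encode("utf-8"))
    return [(el.text or "").strip() for el in root.iter("{%s}language" % DC_NS)]


def html_lang_attrs(xhtml_text):
    """Return the lang and xml:lang values found on the opening <html> tag."""
    match = HTML_TAG.search(xhtml_text)
    if match is None:
        return {}
    return {name: value for name, value in LANG_ATTR.findall(match.group(0))}


def scan_epub(path):
    """Map each OPF and (X)HTML file in the EPUB to its declared language(s)."""
    found = {}
    with zipfile.ZipFile(path) as book:
        for name in book.namelist():
            lower = name.lower()
            if lower.endswith(".opf"):
                text = book.read(name).decode("utf-8", errors="replace")
                found[name] = opf_languages(text)
            elif lower.endswith((".xhtml", ".html", ".htm")):
                text = book.read(name).decode("utf-8", errors="replace")
                found[name] = html_lang_attrs(text)
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    for filename, languages in scan_epub(sys.argv[1]).items():
        print(filename, languages)
```

Run it with the path to the book as the only argument. Any file that reports something other than the language you expect (or nothing at all) is worth a closer look.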
#4 |
null operator (he/him)
Posts: 20,787
Karma: 27405072
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Look for spurious variants of 'are', 'at', 'been', etc. in Tools->Reports->Words. I've occasionally seen something like this caused by convoluted markup.
BR |
#5 |
Evangelist
Posts: 431
Karma: 7000000
Join Date: Aug 2013
Location: Hamden, CT
Device: Kindle Paperwhite (11th gen), Scribe
|
This is from a book covered by copyright, but I don't think the metadata I'm posting violates the rules...if it does, I'm sorry.
Header on each HTML page:
Code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
  <title>Chasing the Dime</title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css" />
</head>
OPF metadata:
Code:
<?xml version="1.0" encoding="utf-8"?>
<package version="2.0" unique-identifier="uid" xmlns="http://www.idpf.org/2007/opf">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
  <dc:title>Chasing the Dime</dc:title>
  <dc:language>en</dc:language>
  <dc:identifier id="uid">3897963789</dc:identifier>
  <dc:creator>Connelly, Michael</dc:creator>
  <dc:publisher>Little, Brown and Company</dc:publisher>
  <dc:subject>Fiction / Thrillers / General</dc:subject>
  <dc:date opf:event="publication">2002-10-15</dc:date>
  <dc:rights>Copyright © 2002 by Hieronymus, Inc.</dc:rights>
  <meta name="output encoding" content="utf-8"/>
  <meta name="primary-writing-mode" content="horizontal-lr"/>
  <meta name="Sigil version" content="1.9.30"/>
  <dc:date opf:event="modification" xmlns:opf="http://www.idpf.org/2007/opf">2023-08-03</dc:date>
</metadata>
Sample paragraph:
Code:
<p class="para-indent">“Well, it’s occupied at the moment but it might not be for long.”</p>
Also note that only the spell check menu item shows the word as misspelled. The editor does not underline the word in purple.
Last edited by nabsltd; 08-09-2023 at 09:44 AM.
#6 |
Wizard
Posts: 1,228
Karma: 5390614
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
@nabsltd
In the Calibre Editor, move the cursor through each individual letter of the misspelt word ("be" in your example). Watch the bottom right corner and see if there are any spurious characters in the word. I know it sounds silly, but hidden characters can end up in the text. I was able to make "be" show as misspelt by adding a word joiner character. See image below...
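If stepping through words by hand gets tedious, the same check can be scripted. This is a minimal Python sketch, assuming the culprits are Unicode "format" characters (category Cf, which includes the word joiner U+2060, the zero width space U+200B and the soft hyphen U+00AD). The function names are invented for this example, and it says nothing about how calibre itself splits words.

```python
import unicodedata


def hidden_characters(text):
    """List (index, code point, Unicode name) for each invisible format character."""
    return [
        (index, "U+%04X" % ord(char), unicodedata.name(char, "<unnamed>"))
        for index, char in enumerate(text)
        if unicodedata.category(char) == "Cf"
    ]


def suspicious_words(text):
    """Return the whitespace-separated words that contain a format character."""
    return [word for word in text.split() if hidden_characters(word)]


if __name__ == "__main__":
    # "be" followed by a word joiner looks identical to plain "be" on screen.
    print(hidden_characters("be\u2060"))
```

Paste the text of a paragraph (copied from the code view, not the preview) into suspicious_words() to list only the words that carry a hidden character.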
#7 | |
Evangelist
Posts: 431
Karma: 7000000
Join Date: Aug 2013
Location: Hamden, CT
Device: Kindle Paperwhite (11th gen), Scribe
|
Quote:
Before:
Code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
  <title>Chasing the Dime</title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css" />
</head>
After:
Code:
<?xml version='1.0' encoding='utf-8'?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en" xml:lang="en">
<head>
  <title>Chasing the Dime</title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css"/>
</head>
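To check whether a tool has dropped header lines like these from other files, something along the following lines could be run on a copy of each file taken before and after the change. It is only a sketch in Python: it looks for just the two things missing from the second header (the DOCTYPE and the Content-Type meta tag), and the names header_features and lost_features are hypothetical.

```python
import re

DOCTYPE = re.compile(r"<!DOCTYPE\b[^>]*>", re.IGNORECASE)
CONTENT_TYPE_META = re.compile(
    r"<meta\b[^>]*http-equiv\s*=\s*[\"']Content-Type[\"'][^>]*>", re.IGNORECASE
)


def header_features(xhtml_text):
    """Report which of the two header items are present in the file."""
    return {
        "doctype": DOCTYPE.search(xhtml_text) is not None,
        "content_type_meta": CONTENT_TYPE_META.search(xhtml_text) is not None,
    }


def lost_features(before, after):
    """Names of header items present in `before` but missing from `after`."""
    old, new = header_features(before), header_features(after)
    return sorted(name for name in old if old[name] and not new[name])
```

Comparing attribute order or quote style is deliberately left out, since those differences do not change what the header declares.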
#8 | |
Bibliophagist
Posts: 38,193
Karma: 152037714
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Quote:
#9 | |
creator of calibre
Posts: 44,180
Karma: 23000000
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
#10 | ||
Bibliophagist
Posts: 38,193
Karma: 152037714
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Quote:
When I last looked at the EPUB 2 documentation and dug into the supporting documents, they referenced the XHTML 1.1 documentation, which states: Quote:
#11 | |
Evangelist
Posts: 431
Karma: 7000000
Join Date: Aug 2013
Location: Hamden, CT
Device: Kindle Paperwhite (11th gen), Scribe
|
Quote:
Those are my headers, and I did not ask the Calibre editor to change HTML tags. I asked it to replace content within an HTML tag. |
#12 |
creator of calibre
Posts: 44,180
Karma: 23000000
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Feel free to not use the editor in that case.
#13 | ||
Grand Sorcerer
Posts: 5,626
Karma: 23190435
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Quote:
#14 | |
creator of calibre
Posts: 44,180
Karma: 23000000
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
EPUB is not HTML; it is XHTML, and it is not rendered directly by browsers. In XHTML served with the correct XHTML MIME type, the DOCTYPE is not required:
"XHTML: If you serve your page as XHTML using the application/xhtml+xml MIME type in the Content-Type HTTP header, you do not need a DOCTYPE to enable standards mode, as such documents always use 'full standards mode'."
https://developer.mozilla.org/en-US/...rds_Mode#xhtml
Even epubcheck agrees with me: it does not warn about a missing DOCTYPE in versions of EPUB newer than EPUB 2.
#15 |
Evangelist
Posts: 431
Karma: 7000000
Join Date: Aug 2013
Location: Hamden, CT
Device: Kindle Paperwhite (11th gen), Scribe
|
Do you actually think silently deleting headers during a spell check replace is acceptable behavior?
No other search and replace in the Calibre editor does this. Only "Fix HTML" and "Beautify files" make this sort of header change; there the user would expect it, and the changes can be reverted using "See what changed". I'd argue that neither of those should change valid headers either, but that's a different issue. This behavior definitely does not follow the principle of least astonishment.