12-14-2019, 05:22 AM | #1 | |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Failure to load French dictionaries from .oxt
Hi
I am now on calibre 4.6. I tried to install the latest French dictionaries, which are included as .dic and .aff files in a Grammalecte .oxt that I had downloaded and installed on my hard disk (Arch Linux). The import failed. Code:
calibre, version 4.6.0
ERREUR : Échec à l'importation des dictionnaires: Échec à l'importation des dictionnaires depuis /home/roger/Documents/Documents pour EPUB/extensions OO/Grammalecte-fr-v1.6.0.oxt. Cliquer "Afficher les détails" pour plus d'information
(English: ERROR: Failed to import dictionaries: Failed to import dictionaries from /home/roger/Documents/Documents pour EPUB/extensions OO/Grammalecte-fr-v1.6.0.oxt. Click "Show details" for more information)
Traceback (most recent call last):
  File "site-packages/calibre/gui2/tweak_book/spell.py", line 122, in accept
  File "site-packages/calibre/spell/import_from.py", line 126, in import_from_oxt
  File "src/lxml/etree.pyx", line 3222, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1877, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1765, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1127, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
XMLSyntaxError: Namespace prefix manifest on manifest is not defined, line 2, column 19 (line 2)
Why? My pref.json file in this folder contains Quote:
|
|
12-14-2019, 12:13 PM | #2 |
creator of calibre
Posts: 43,897
Karma: 22666668
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That indicates the oxt file is malformed; parsing its XML is failing.
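An .oxt file is a zip archive, so this kind of failure can be checked outside calibre. The following is a minimal sketch, not calibre's own import code: it uses the standard-library `xml.etree.ElementTree` parser rather than lxml, and it assumes the failing file is `META-INF/manifest.xml` (the error message mentions a `manifest` element, but the thread does not name the file). The helper names `check_oxt_manifest` and `make_oxt` and the two sample manifests are hypothetical and exist only for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_oxt_manifest(oxt_bytes, name="META-INF/manifest.xml"):
    """Return None if the manifest parses as XML, else the parser's error text."""
    with zipfile.ZipFile(io.BytesIO(oxt_bytes)) as zf:
        raw = zf.read(name)
    try:
        ET.fromstring(raw)
    except ET.ParseError as err:
        return str(err)
    return None


def make_oxt(manifest_xml):
    """Build a tiny in-memory archive holding only a manifest (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/manifest.xml", manifest_xml)
    return buf.getvalue()


# The root element uses the "manifest:" prefix without declaring it.
BAD = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<manifest:manifest>\n"
    "</manifest:manifest>\n"
)

# Same document with the prefix bound to a namespace URI.
GOOD = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<manifest:manifest xmlns:manifest="http://openoffice.org/2001/manifest">\n'
    "</manifest:manifest>\n"
)

print("bad manifest :", check_oxt_manifest(make_oxt(BAD)))
print("good manifest:", check_oxt_manifest(make_oxt(GOOD)))
```

To check a real file, read it with `open(path, "rb").read()` and pass the bytes to `check_oxt_manifest`. A non-None result means the archive's manifest is not well-formed XML, which is consistent with the diagnosis above.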
|
12-14-2019, 06:31 PM | #3 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Thanks for your reply.
So I set this version of Grammalecte aside and installed the one from the LibreOffice extensions site: https://extensions.libreoffice.org/e...aires-francais
This version of the dictionary dates back to 2017 but should be fit for purpose. Indeed, it installed cleanly when I strictly followed the Preferences and .oxt install procedure.
When I opened an EPUB and ran the spell check, it appeared to work, but it did not filter out enough of the unknown words. I got a huge list of words, most of them correct, so I can't make any use of this list. Here is one example in the attached screenshot: nearly all the words that appear on this screen are spelled correctly. To be more precise, if I click on the place marked with the arrow, the list does not change. I should see most of the entries disappear, but they all stay put. No filtering.
This had been working with the calibre 3.x series for as long as I can recall.
P.S. I noticed that in my .config folder I had .config/calibre/dictionaries along with the other regular items, and also .config/calibre/calibre/dictionaries with what seems to be an older duplicate of the regular items. I guess the top-level installation is the only one in use, so I deleted the second one, but it changed nothing in the results above. No filtering.
Last edited by roger64; 12-14-2019 at 08:11 PM. Reason: double |
12-14-2019, 10:41 PM | #4 |
creator of calibre
Posts: 43,897
Karma: 22666668
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
There have been no changes to spell checking since calibre 3, and filtering to only misspelled words definitely works; I just tested it. Presumably the dictionary is incomplete, is missing those words, or is not working correctly. Where are you getting the dictionary from? As far as I can see, the LibreOffice French dictionary was updated last year, not in 2017: https://github.com/LibreOffice/dicti...e/master/fr_FR
|
12-14-2019, 11:26 PM | #5 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Something in my settings must be preventing proper filtering, but I do not think it comes from the French dictionaries I install.
They are placed in the folder .config/calibre/dictionaries and are selected by default. I have attached the folder here. |
12-14-2019, 11:49 PM | #6 |
creator of calibre
Posts: 43,897
Karma: 22666668
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I installed the French dictionary from here: https://extensions.libreoffice.org/e...aires-francais
and added <p lang="fr">bombes</p> to a book and ran spell check and, as expected, the word bombes was filtered out. Since you are on Linux, I suggest making sure you are running the official calibre binaries. |
12-15-2019, 05:51 AM | #7 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Calibre 4.6 for Linux was downloaded from your official repo.
I obtain different results for the word "bombes". I put this word in <span lang="fr">bombes</span> and it disappeared from the list in both cases. As it is a genuine French word, it should have appeared in the list of all words and disappeared (been filtered out) only when I tick "Display only misspelled words". Last edited by roger64; 12-15-2019 at 05:53 AM. Reason: tick |
12-18-2019, 10:09 PM | #8 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Hi
I downloaded and installed the calibre 4.6 binaries again from the official repo, and did the same for the French dictionary .oxt from the LibreOffice repo. The error reported above (no filtering) is still there. It has been present since day 1 of calibre 4. Spell checking always worked with previous versions of calibre, and it works with Sigil (including the latest v1.). I wish there were some debug tool I could use to help track this down. |
12-18-2019, 10:52 PM | #9 |
creator of calibre
Posts: 43,897
Karma: 22666668
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You probably have soft hyphens or similar invisible characters in your words. It has nothing to do with filtering; spell check is flagging your words as misspelled.
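One way to test this hypothesis on a word copied from the spell-check list is to list every character that is a format character or a space separator. The sketch below uses only the standard-library `unicodedata` module. The function name `invisible_chars` is hypothetical, and the choice of Unicode categories `Cf` and `Zs` is an assumption about what counts as "invisible" here, not something calibre itself does.

```python
import unicodedata


def invisible_chars(word):
    """Return (index, code point, name) for format and space characters in word."""
    found = []
    for i, ch in enumerate(word):
        cat = unicodedata.category(ch)
        # Cf = format characters (for example the soft hyphen);
        # Zs = space separators (includes the no-break spaces).
        if cat in ("Cf", "Zs"):
            found.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return found


print(invisible_chars("bom\u00adbes"))   # soft hyphen inside the word
print(invisible_chars("\u202fbombes"))   # narrow no-break space before the word
print(invisible_chars("bombes"))         # clean word
```

An empty list means the word contains no such characters, so the cause would have to be looked for elsewhere.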
|
12-19-2019, 02:45 AM | #10 | |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
It can't be a strange character. I also tried with older books of mine. It can't be that.
Let me add an aside, who knows? A few weeks ago, one program (called "formateur de texte", the text formatter) was not working on my computer (Arch Linux). It is part of a larger LibreOffice extension called "Grammalecte". At first, the developer deemed the Linux build to be the probable culprit. To cut a long story short, it turned out that a change was needed in some Python code, maybe as a consequence of the move to Python 3.8: time.clock() had to be replaced with time.perf_counter(). Quote:
Last edited by roger64; 12-19-2019 at 02:50 AM. |
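For context on the aside above: `time.clock()` was deprecated in Python 3.3 and removed in Python 3.8, so code that still calls it fails there with an AttributeError. The snippet below is a minimal sketch of the replacement. The helper name `timed` is hypothetical and is not taken from Grammalecte.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed seconds) measured with perf_counter."""
    start = time.perf_counter()  # replaces the removed time.clock()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


result, elapsed = timed(sum, range(1000))
print(result, f"{elapsed:.6f}s")
```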
|
12-19-2019, 10:52 AM | #11 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Hi
This is what I found: I found French books where filtering is working and others where it is not. All books have this code in the opf
Code:
<dc:language>fr-FR</dc:language>
Code:
<dc:language>fr</dc:language>
With these "head", it's OK and filtering is working
Code:
<?xml version='1.0' encoding='utf-8'?>
<html xmlns="http://www.w3.org/1999/xhtml">
Code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
Code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="fr-FR" xml:lang="fr-FR">
Last edited by roger64; 12-19-2019 at 10:57 AM. |
12-19-2019, 12:02 PM | #12 |
creator of calibre
Posts: 43,897
Karma: 22666668
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It has nothing to do with headers. It's because some of the words are surrounded by narrow no-break spaces, and the updated ICU library that calibre 4 uses considers these part of the word, which for spelling purposes they definitely are not.
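The effect can be reproduced without ICU. In the sketch below, a plain Python set stands in for the dictionary, and the names `clean_word` and `is_misspelled` are hypothetical. It only illustrates why a word that still carries a narrow no-break space (U+202F) fails the lookup, and how trimming such characters before the lookup avoids that. It is not the change made in calibre.

```python
NNBSP = "\u202f"        # narrow no-break space, used in French punctuation
NBSP = "\u00a0"         # no-break space
SOFT_HYPHEN = "\u00ad"  # invisible hyphenation hint


def clean_word(word):
    """Trim no-break spaces from both ends and drop soft hyphens."""
    return word.strip(NNBSP + NBSP).replace(SOFT_HYPHEN, "")


def is_misspelled(word, known_words, clean=True):
    """Look the word up in known_words, optionally cleaning it first."""
    candidate = clean_word(word) if clean else word
    return candidate.lower() not in known_words


known = {"bombes"}                 # stand-in for a real French dictionary
token = NNBSP + "bombes" + NNBSP   # what a tokenizer might hand over

print(is_misspelled(token, known, clean=False))  # True: lookup fails on the raw token
print(is_misspelled(token, known))               # False: lookup succeeds after cleaning
```

Cleaning at lookup time leaves the book text untouched, so the narrow no-break spaces stay in the EPUB.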
|
12-20-2019, 02:46 AM | #13 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
@Kovid
Thank you for finding the reason!! I was sure something had changed with version 4 but had no idea what... I did not suspect this because I have been using nnbsp in my EPUBs for many years. Does this mean that from now on I'll have to choose:
- either use nnbsp (permitted and even recommended by French punctuation rules)
- or use the calibre French spellchecker?
The updated ICU library made a strange choice indeed. Can this absurd situation be fixed so that things work the customary way again (i.e. use nnbsp AND use spell check)? |
12-20-2019, 04:11 AM | #14 |
creator of calibre
Posts: 43,897
Karma: 22666668
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It's already done: https://github.com/kovidgoyal/calibr...7ed2a354d73d17
|
12-20-2019, 04:16 AM | #15 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
|
|