07-04-2020, 04:20 PM | #1 |
Junior Member
Posts: 8
Karma: 13090
Join Date: Jul 2020
Device: Kobo Forma
|
New bilingual Kobo dictionaries based on Wiktionary
Hi everyone!
As part of my project WikDict, I have created a large set of bilingual dictionaries in both the native Kobo dictionary format and StarDict format. If you want to check the data quality before downloading any dictionaries, I recommend using the web interface at https://www.wikdict.com, which is based on the same data source.
Kobo dictionaries: http://download.wikdict.com/dictionaries/kobo/
StarDict dictionaries: http://download.wikdict.com/dictionaries/stardict/
License: CC-BY-SA 3.0
Please give me some feedback! |
07-04-2020, 04:59 PM | #2 |
Wizard
Posts: 2,792
Karma: 6990707
Join Date: May 2016
Location: Ontario, Canada
Device: Kobo Mini, Aura Edition 2 v1, Clara HD
|
Nice work overall!
I haven't taken a close look at it yet, but PyGlossary (based on your GH profile and the output, that appears to be what you're using to convert the dictionaries) has some bugs in its handling of prefixes for Kobo's dictionary format. In particular, uppercase words (e.g. Cumberland in sv-en) will appear in the autocomplete, but looking them up will show "no definition found". See here for my notes on proper prefix generation. One way to fix this automatically without modifying PyGlossary would be to use dictzip-decompile and dictgen (see my thread about dictutil) to regenerate the dictionaries. |
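To illustrate why a case mismatch in prefix generation makes a word show up in autocomplete but return no definition, here is a minimal Python sketch. It is not the actual Kobo or PyGlossary algorithm: `simplified_prefix`, `case_preserving_prefix`, `build_buckets`, and `lookup` are hypothetical names, and the rule used here (trim, lowercase, take the first two characters) is a deliberately simplified assumption. The real rules are the ones in the prefix-generation notes mentioned above.

```python
def simplified_prefix(word: str, length: int = 2) -> str:
    """Hypothetical, simplified bucket key: trimmed, lowercased leading characters."""
    return word.strip().lower()[:length]


def case_preserving_prefix(word: str, length: int = 2) -> str:
    """A buggy variant that forgets to lowercase the headword."""
    return word.strip()[:length]


def build_buckets(headwords, key_func):
    """Group headwords into buckets, one bucket per prefix key."""
    buckets = {}
    for word in headwords:
        buckets.setdefault(key_func(word), []).append(word)
    return buckets


def lookup(buckets, word: str) -> bool:
    """Reader side: ASSUMED to always compute the lowercased key."""
    return word in buckets.get(simplified_prefix(word), [])


headwords = ["Cumberland", "cumin"]
buggy = build_buckets(headwords, case_preserving_prefix)
fixed = build_buckets(headwords, simplified_prefix)

print(lookup(buggy, "Cumberland"))  # the entry sits in bucket "Cu", the reader looks in "cu"
print(lookup(fixed, "Cumberland"))  # builder and reader agree on "cu"
```

The point of the sketch is only that the dictionary builder and the reader must derive the bucket key in exactly the same way; any difference (case, whitespace, padding) hides entries that are otherwise present in the file.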
07-05-2020, 07:08 AM | #3 | ||
Junior Member
Posts: 8
Karma: 13090
Join Date: Jul 2020
Device: Kobo Forma
|
Quote:
Quote:
|
||
07-05-2020, 12:03 PM | #4 |
Wizard
Posts: 2,792
Karma: 6990707
Join Date: May 2016
Location: Ontario, Canada
Device: Kobo Mini, Aura Edition 2 v1, Clara HD
|
That was an issue in the page due to whitespace collapsing. The first one had two spaces, but the second had one. I've responded on the PyGlossary issue.
|
07-05-2020, 02:24 PM | #5 |
Junior Member
Posts: 8
Karma: 13090
Join Date: Jul 2020
Device: Kobo Forma
|
I updated the Kobo dictionaries with the recent pyglossary fixes. If anything else seems to be wrong, don't hesitate to let me know!
|
07-08-2020, 05:47 AM | #6 |
Connoisseur
Posts: 94
Karma: 12
Join Date: Nov 2018
Location: Salamanca
Device: kobo Clara HD, Onyxboox C67
|
Hi Karl, your work is spectacular. For fun, I have used your TEI files to build a single 'several languages - Spanish' dictionary. When you create new TEI files, I will add them to my dictionary. Thanks, many thanks.
|
02-13-2021, 12:30 PM | #7 |
Member
Posts: 21
Karma: 3620
Join Date: Feb 2021
Device: Pocketbook
|
Hi Karl,
It seems that the EN-FR and EN-NL dictionaries are missing some words and expressions: e.g. words between "trough" and "trouser press" are missing in the dicthtml-en-nl file, and words between "trout" and "trouvère" in the dicthtml-en-fr file. So the word "trouser" is missing in both of them (but is present on the wikdict.com website). Is this a bug in the conversion to dicthtml? Thanks a lot for your great work! |
02-24-2021, 09:05 AM | #8 | |
Junior Member
Posts: 8
Karma: 13090
Join Date: Jul 2020
Device: Kobo Forma
|
Quote:
https://en.wiktionary.org/wiki/trousers I'm not yet able to handle these references, so the translation is missing from the dictionary. Why is it shown on wikdict.com, then? Since I control the whole search and sorting UI on that page, I can take a few more liberties in deciding when to show translations to the user. One aspect of that is that when no results are found, I try to synthesize translations by looking at Wiktionaries in other languages. In this specific case, I returned a translation by going through the Finnish Wiktionary: trousers (en) -> housut (fi) -> pantalon (fr). If you would like updates on this issue, please subscribe to it on [GitHub](https://github.com/karlb/wikdict-gen/issues/5). |
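The fallback described above (going from English to French through Finnish) can be sketched as a pivot lookup. This is only an illustration, not the code behind the website: the function name `pivot_translations` and the two small dictionaries are hypothetical, and the only data used is the example chain from the post.

```python
def pivot_translations(word, source_to_pivot, pivot_to_target):
    """Synthesize translations by chaining two bilingual mappings.

    Each mapping is a dict from a headword to a list of translations.
    Results keep their first-seen order and contain no duplicates.
    """
    results = []
    for pivot_word in source_to_pivot.get(word, []):
        for target_word in pivot_to_target.get(pivot_word, []):
            if target_word not in results:
                results.append(target_word)
    return results


# Toy data taken from the example in the post (en -> fi -> fr).
en_fi = {"trousers": ["housut"]}
fi_fr = {"housut": ["pantalon"]}

print(pivot_translations("trousers", en_fi, fi_fr))  # ['pantalon']
print(pivot_translations("trout", en_fi, fi_fr))     # [] (nothing to chain)
```

Translations obtained this way are guesses, since the pivot word may carry senses the source word does not have, which fits with using them only when no direct result is found.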
|
02-25-2021, 08:04 PM | #9 |
Connoisseur
Posts: 53
Karma: 20
Join Date: Apr 2017
Device: KK3G, PW, Voyage, Oasis, Aura One, Forma
|
|
02-26-2021, 02:54 AM | #10 | |
Junior Member
Posts: 8
Karma: 13090
Join Date: Jul 2020
Device: Kobo Forma
|
Quote:
* Download wiki dump
* Read wiki markup from dump and convert to HTML pages
* Combine HTML pages into StarDict dictionary

Unless you are lucky and find tools that do this exactly the way you want (or someone has already done the steps for you), this will involve at least some amount of programming. Also note that it can be cumbersome to find information in large wiki pages (e.g. translations in Wiktionary pages). Since I take a different route, using a [semantically parsed version of Wiktionary](http://kaiko.getalp.org/about-dbnary/) and creating my own pages from that, I can't give specific instructions. I'm also not generating StarDict files directly; I'm using [pyglossary](https://github.com/ilius/pyglossary). |
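As a rough illustration of the second step listed above (reading wiki markup from a dump), here is a minimal Python sketch that streams page titles and raw wikitext out of a MediaWiki XML export using only the standard library. The helper names `local_name` and `iter_pages` and the tiny in-memory sample are made up for this example, and the namespace URI in the sample is an assumption (the code ignores namespaces, so the exact version does not matter). Converting the markup to HTML and building the StarDict files are not covered here.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace like '{uri}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield (title, wikitext) pairs without loading the whole dump into memory."""
    title, text = None, None
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # free the memory held by the finished page


# Tiny stand-in for a real dump file (assumed export structure).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>trousers</title><revision><text>==English==</text></revision></page>
  <page><title>trout</title><revision><text>==English==</text></revision></page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
for page_title, wikitext in pages:
    print(page_title, len(wikitext))
```

For a real dump you would pass an opened file (for example one wrapped with `bz2.open`) instead of the `io.BytesIO` sample.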
|
02-26-2021, 04:02 AM | #11 |
Guru
Posts: 891
Karma: 270670
Join Date: Jun 2016
Device: Kobo
|
|
03-21-2021, 08:57 PM | #12 | |
Enthusiast
Posts: 25
Karma: 10
Join Date: Jan 2021
Device: kobo h2o libra
|
Please, I need de-ar, fr-ar, it-ar
Quote:
Please, I need German/Arabic, Italian/Arabic, and French/Arabic dictionaries. I am tired of searching for help, and I know very little about computing. I have uploaded three very good dictionaries in EPUB form, but I can't convert them to Kobo format. Thanks a lot if you can help me. |
|
10-10-2022, 07:45 AM | #15 |
Member
Posts: 23
Karma: 10
Join Date: Jun 2022
Device: Kobo Libra 2
|
Norwegian-English dictionary
The file dicthtml-no-en.zip was converted from an unzipped StarDict file. The original is stardict-comn_sdict05_norwegian-english-2.4.2.tar.bz2, converted using the following command:
penelope -i /home/petbest/Downloads/Dict//stardict-comn_sdict05_norwegian-english-2.4.2.zip -j stardict -f no -t en -p kobo -o /home/petbest/Downloads/Dict/dichthtml-no-en |