01-11-2016, 07:05 AM | #856 |
Daywalker
Posts: 29
Karma: 52
Join Date: Jul 2008
Device: Kindle Paperwhite
|
By the way, for German the formula to calculate the Flesch Reading Ease is different. I have created my own copy of the plugin, as I have never figured out how to use the book language to switch to the other algorithm automatically.
As the plugin is still actively developed, maybe the following change can be integrated:
Code:
# German Flesch Reading Ease
score = 180 - text_analysis['averageWordsPerSentence'] - (58.5 * (text_analysis['syllableCount'] / text_analysis['wordCount'])) |
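For anyone who wants to try the German formula outside the plugin, here is a minimal sketch. It assumes a `text_analysis` dict with the same three keys as in the post above; the function name and the sample numbers are made up for illustration. The `float()` call is there because the plugin may run under Python 2, where dividing two integers truncates.

```python
def flesch_reading_ease_german(text_analysis):
    """German Flesch Reading Ease, using the formula quoted in the post above."""
    words_per_sentence = text_analysis['averageWordsPerSentence']
    # float() guards against integer division under Python 2.
    syllables_per_word = float(text_analysis['syllableCount']) / text_analysis['wordCount']
    return 180 - words_per_sentence - 58.5 * syllables_per_word


# Made-up sample values, not taken from a real book.
sample = {'averageWordsPerSentence': 15.0, 'syllableCount': 180, 'wordCount': 100}
score = flesch_reading_ease_german(sample)
```

With these sample values the text averages 1.8 syllables per word, so the score is 180 - 15 - 58.5 * 1.8.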
01-12-2016, 01:06 PM | #857 |
Enthusiast
Posts: 37
Karma: 10
Join Date: Jul 2014
Device: Kobo Mini
|
There is another metric (well, four variants of it …) for German text, the "Wiener Sachtextformel".
Translation for the formula:
MS: percentage of words with three or more syllables
SL: average number of words per sentence
IW: percentage of words with six or more letters
ES: percentage of words with one syllable
There are also other metrics for English that can be found in an NLTK-based implementation on GitHub. |
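The four inputs listed above can be computed roughly like this. This is only a sketch: it assumes the text is already split into sentences and words, and that the caller supplies a syllable counter. The function name `wiener_inputs` and the toy syllable table are hypothetical, not part of the plugin or of NLTK.

```python
def wiener_inputs(sentences, count_syllables):
    """Compute MS, SL, IW and ES as defined in the post above.

    sentences       -- list of sentences, each a list of words
    count_syllables -- callable mapping a word to its syllable count
                       (hypothetical, supplied by the caller)
    """
    words = [word for sentence in sentences for word in sentence]
    total = float(len(words))
    syllables = [count_syllables(word) for word in words]
    return {
        'MS': 100 * sum(1 for count in syllables if count >= 3) / total,
        'SL': total / len(sentences),
        'IW': 100 * sum(1 for word in words if len(word) >= 6) / total,
        'ES': 100 * sum(1 for count in syllables if count == 1) / total,
    }


# Toy data: syllable counts are hard-coded so the example needs no real counter.
toy_syllables = {'Der': 1, 'Hund': 1, 'bellt': 1, 'Die': 1,
                 'Nachbarin': 3, 'schimpft': 1, 'lautstark': 2, 'gestern': 2}
sentences = [['Der', 'Hund', 'bellt'],
             ['Die', 'Nachbarin', 'schimpft', 'lautstark', 'gestern']]
inputs = wiener_inputs(sentences, lambda word: toy_syllables[word])
```

A real syllable counter for German is the hard part and is deliberately left out here.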
01-12-2016, 06:52 PM | #858 |
Grand Sorcerer
Posts: 24,905
Karma: 47303822
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Adding a language-specific version of the statistics is easy. I just need to add the calculations and decide which to use based on the language of the book.
But adding more statistics is a lot harder. The plugin has a cut-down version of the NLTK library. From the notes in the plugin, it probably only has the English statistics, so that would have to change to the full version. Then the configuration would have to be changed. And are these extra stats to calculate, or alternatives to use in place of the three English stats? At the moment, I'm not interested in going through this. If someone is, then I'll happily help. |
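Choosing the calculation by book language could look something like the sketch below. It is not the plugin's code. The English constants are the standard Flesch Reading Ease ones, the German formula is the one posted earlier in the thread, and the set of language codes (`de`, `deu`, `ger`) and all function names are assumptions.

```python
def flesch_english(words_per_sentence, syllables_per_word):
    # Standard English Flesch Reading Ease.
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def flesch_german(words_per_sentence, syllables_per_word):
    # German variant quoted earlier in the thread.
    return 180 - words_per_sentence - 58.5 * syllables_per_word


# Assumed language codes; the real plugin may see other forms.
FLESCH_BY_LANGUAGE = {'de': flesch_german, 'deu': flesch_german, 'ger': flesch_german}


def flesch_reading_ease(language, text_analysis):
    """Pick the formula from the book language, falling back to English."""
    # Reduce codes such as 'de-DE' or 'de_AT' to their primary part.
    code = (language or '').lower().replace('_', '-').split('-')[0]
    formula = FLESCH_BY_LANGUAGE.get(code, flesch_english)
    syllables_per_word = float(text_analysis['syllableCount']) / text_analysis['wordCount']
    return formula(text_analysis['averageWordsPerSentence'], syllables_per_word)


# Made-up sample values.
sample = {'averageWordsPerSentence': 15.0, 'syllableCount': 180, 'wordCount': 100}
german_score = flesch_reading_ease('de-DE', sample)
english_score = flesch_reading_ease('eng', sample)
```

Falling back to English for unknown or missing languages keeps the current behaviour for every book that is not marked as German.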
01-15-2016, 03:45 AM | #859 |
Daywalker
Posts: 29
Karma: 52
Join Date: Jul 2008
Device: Kindle Paperwhite
|
Hello davidfor,
I'd appreciate it if you could send me the code showing how to add a language-specific version, or just post it here. Thanks!
In my private version I have tried out the "german.pickle" from the nltk package (modified to work with the plugin), but the difference was <1%. I don't care much about higher accuracy, e.g. whether the reading ease is 75.5 or 76.7. If it's easy to select the correct pickle file on the fly, well, then it makes sense to use that one.
Regarding the "Wiener Sachtextformel", I am using the 4th variant, which is calculated like this:
Code:
score = (0.2656 * text_analysis['averageWordsPerSentence']) + (0.2744 * (text_analysis['complexwordCount'] * 100 / text_analysis['wordCount'])) - 1.693
It can replace the "Gunning Fog Index" (as "years of education"), which doesn't work for German books anyway. |
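The 4th variant quoted above, wrapped as a function with a worked example. This is a sketch that assumes `complexwordCount` is the number of words with three or more syllables (which is what the MS input needs); the function name and sample numbers are made up.

```python
def wiener_sachtextformel_4(text_analysis):
    """Fourth variant of the Wiener Sachtextformel, as quoted in the post above."""
    words_per_sentence = text_analysis['averageWordsPerSentence']
    # Assumption: complexwordCount counts words with three or more syllables.
    # 100.0 keeps the division in floating point under Python 2 as well.
    complex_percentage = text_analysis['complexwordCount'] * 100.0 / text_analysis['wordCount']
    return 0.2656 * words_per_sentence + 0.2744 * complex_percentage - 1.693


# Made-up sample: 15 words per sentence, 20 complex words out of 100.
sample = {'averageWordsPerSentence': 15.0, 'complexwordCount': 20, 'wordCount': 100}
grade = wiener_sachtextformel_4(sample)
```

For the sample this works out as 0.2656 * 15 + 0.2744 * 20 - 1.693.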
01-15-2016, 09:51 AM | #860 |
Grand Sorcerer
Posts: 24,905
Karma: 47303822
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
OK, here is a beta that has the German version of the Flesch Reading Ease statistic. The code for getting the language is in jobs.py at line 167.
Code:
from calibre.utils.localization import get_lang
lang = iterator.opf.language
lang = get_lang() if not lang else lang
And thinking about the extra statistics, I have thought of a way that might work to handle this. At the moment, there are five statistics: words, pages, Flesch Reading Ease, Flesch-Kincaid Grade and Gunning Fog Index. These are fixed, and the options around them are where to store the results.
My thought is to add the extra statistics, but make them selectable from a list. The word and page count would be kept as they are. For the others, have pairs of drop-down lists: the first of each pair lists the statistics, the second the column to store it in. With that, exactly which statistics are used from the full set would be up to the user. I would probably limit this to three stats, but, with a little thought, it could be extended to as many as needed.
I haven't looked enough at the NLTK code to see how easy it would be to replace the version in the plugin with a more complete version. For the simpler statistics that use calculations similar to those already in place, adding them in this way should be practical.
Last edited by davidfor; 01-15-2016 at 09:52 AM. Reason: Yet again, I forgot to attach the file. |
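The "pairs of drop-down lists" idea from this post might be modelled as a registry of statistics plus a list of (statistic, column) selections. This is purely a sketch of the idea: the registry, the function names and the custom column names such as `#reading_ease` are hypothetical. The three formulas are the standard English ones.

```python
def _syllables_per_word(ta):
    return float(ta['syllableCount']) / ta['wordCount']


def flesch_reading_ease(ta):
    return 206.835 - 1.015 * ta['averageWordsPerSentence'] - 84.6 * _syllables_per_word(ta)


def flesch_kincaid_grade(ta):
    return 0.39 * ta['averageWordsPerSentence'] + 11.8 * _syllables_per_word(ta) - 15.59


def gunning_fog(ta):
    complex_percentage = ta['complexwordCount'] * 100.0 / ta['wordCount']
    return 0.4 * (ta['averageWordsPerSentence'] + complex_percentage)


# First drop-down of each pair: the names offered to the user.
STATISTICS = {
    'Flesch Reading Ease': flesch_reading_ease,
    'Flesch-Kincaid Grade': flesch_kincaid_grade,
    'Gunning Fog Index': gunning_fog,
}


def compute_selected(text_analysis, selections):
    """selections: list of (statistic name, column name) pairs chosen by the user."""
    return {column: STATISTICS[name](text_analysis) for name, column in selections}


# Hypothetical user configuration and made-up sample values.
selections = [('Flesch Reading Ease', '#reading_ease'), ('Gunning Fog Index', '#fog')]
sample = {'averageWordsPerSentence': 15.0, 'syllableCount': 180,
          'wordCount': 100, 'complexwordCount': 20}
results = compute_selected(sample, selections)
```

Adding a new statistic then only means adding one function and one registry entry; the number of pairs shown in the configuration dialog is independent of that.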
01-20-2016, 06:55 AM | #861 |
Daywalker
Posts: 29
Karma: 52
Join Date: Jul 2008
Device: Kindle Paperwhite
|
I am not sure if the interface language helps much. My library is mixed with books in English, German and some French.
How can the language be retrieved from a book? That would be the preferred way to do it. |
01-20-2016, 07:30 AM | #862 | |
null operator (he/him)
Posts: 21,006
Karma: 27620706
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
BR |
|
01-20-2016, 07:31 AM | #863 |
Grand Sorcerer
Posts: 24,905
Karma: 47303822
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
The code I am using attempts to get the language from the book. If it can't, then it uses the interface language. If we extend the statistics, there is a problem, as an extra language-specific file is used. At the moment, this is loaded early, before the individual book languages are known, and that could be a problem.
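One possible way around the early-loading problem is to load the language-specific file lazily, the first time a book in that language turns up, and cache it. The sketch below only illustrates that idea. The class, the short language codes and the injected `loader` are hypothetical; the file names follow the "german.pickle" naming mentioned earlier in the thread.

```python
# Assumed mapping from language code to the language-specific file.
PICKLE_BY_LANGUAGE = {'en': 'english.pickle', 'de': 'german.pickle', 'fr': 'french.pickle'}


class TokenizerCache(object):
    """Load each language-specific file at most once, on first use."""

    def __init__(self, loader):
        # loader: callable taking a file name and returning the loaded object
        # (hypothetical; in the plugin this would unpickle the file).
        self._loader = loader
        self._cache = {}

    def get(self, book_language, interface_language):
        # Same fallback as described above: book language first, then interface.
        language = book_language if book_language else interface_language
        filename = PICKLE_BY_LANGUAGE.get(language, 'english.pickle')
        if filename not in self._cache:
            self._cache[filename] = self._loader(filename)
        return self._cache[filename]


# Stand-in loader that records what was loaded instead of reading a file.
loads = []

def fake_loader(filename):
    loads.append(filename)
    return 'tokenizer for ' + filename

cache = TokenizerCache(fake_loader)
first = cache.get('de', 'en')
second = cache.get('de', 'en')   # served from the cache, no second load
fallback = cache.get(None, 'en')  # no book language, so the interface language is used
```

With this shape, a mixed English, German and French library costs at most three loads per job, however many books are processed.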
|
01-20-2016, 12:55 PM | #864 | |
Grand Sorcerer
Posts: 12,038
Karma: 7257323
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
|
|
01-20-2016, 02:02 PM | #865 | |
Connoisseur
Posts: 85
Karma: 10
Join Date: Oct 2014
Device: Kindle Paperwhite 2
|
Quote:
|
|
01-20-2016, 02:11 PM | #866 | |
Wizard
Posts: 1,166
Karma: 1410083
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
Quote:
I like this too |
|
01-20-2016, 07:07 PM | #867 |
Grand Sorcerer
Posts: 24,905
Karma: 47303822
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Yes, the OPF in the book. Getting it from the database was in the back of my mind when I was writing the post, but this is happening inside a job, so I didn't think I would have access to the database. It could be part of the data collected before starting the job. And that is probably a good idea, as there is no guarantee that the copy of the book in the library has been updated with the latest metadata.
|
01-20-2016, 08:35 PM | #868 | |
Grand Sorcerer
Posts: 24,905
Karma: 47303822
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
The method currently used for the word count is the same for all languages. It is a fairly simple method but is not too inaccurate, at least for English. I don't know about the other languages, which is why I asked. If it wasn't for the other languages, I wouldn't bother about this, as for nearly all uses the count we have is close enough. But I think the other languages should be treated properly.
The other thing that has been discussed is the other statistics. These are English-specific stats. The first mention was because someone had the German calculation for one of them. Having that used automatically was easy and sensible. But the rest of the stats are more of a problem. And there are other stats that make more sense for other languages. Adding a way to calculate them is a lot more complex.
My comments about the other stats are really just me thinking out loud. At this point, I have no plan to implement them. There are other things I would prefer to do. Maybe in the future I might get bored and return to it. Or maybe someone will see my comments and decide to do it. If someone does, I'll be very happy to help with suggestions, testing and other help.
My plan for Count Pages is to release the changes as is (different word count algorithm, German version of one of the other stats) plus one other change. The other change is something someone else has done and is about making the plugin work better when called from other plugins. |
|
01-21-2016, 05:32 AM | #869 |
Daywalker
Posts: 29
Karma: 52
Join Date: Jul 2008
Device: Kindle Paperwhite
|
|
01-24-2016, 05:42 PM | #870 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2016
Device: Android device with Kindle
|
Is there any way to generate an APNX file from this plugin without the send-to-Kindle method? I'm trying to create an APNX file for my Android device. I used the APNX generator plugin, but the page number is not accurate. So I'm looking for a way to create an APNX file from the Goodreads page number.
I don't have any Kindle devices. I searched for a way to make my Android device appear as a Kindle so that I can use the page count column to generate the APNX file, but with no luck. I don't know why there is no way to generate an APNX file for the Kindle app on Android, either when I choose send to device or in the Android device interface plugin settings. |