08-05-2015, 10:56 AM | #1 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
[GUI Plugin] English Noun Frequency
Summary: Determines 'English Noun Frequencies' for words in a particular book's text, and will optionally:
Questions & Answers: Spoiler:
Requires Minimum Calibre Version: 6.0.0
Version History: Spoiler:
Last edited by DaltonST; 02-20-2023 at 07:34 PM. Reason: Release 1.0.16 |
08-05-2015, 10:56 AM | #2 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
For future use only.
|
08-05-2015, 10:57 AM | #3 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
For future use.
|
10-01-2015, 11:27 AM | #4 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
Release 1.0.4
Release 1.0.4 has been posted, and provides some enhanced ToolTips.
DaltonST |
10-06-2015, 03:55 PM | #5 |
Insert Real Name
Posts: 1
Karma: 10
Join Date: Aug 2011
Location: In books.
Device: Sony PRS-T1
|
Plugin no longer works on Calibre 2.40 (Windows 7).
Here is the Calibre Debug Log: Spoiler:
Last edited by BetterRed; 10-07-2015 at 05:18 AM. Reason: wrap debug output in code and spoiler tags |
10-06-2015, 06:47 PM | #6 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
ENF Works Perfectly in Calibre 2.40 on Windows 64bit
I just ran it against all of the books in one of my test libraries for Calibre 2.40 Windows 64bit, and it worked perfectly.
It looks to me as though you are jumping to a conclusion without having posted any empirical data to prove that it is not the .epub's fault. For example, you did not attach a screen-snip indicating that the "Count Pages" plug-in actually found real pages of "text". A lack of real text is quite common in .PDF files that were created from scans, since "images" are not "text". "Count Pages" might find zero pages of "text" in a .PDF that is 5 MB in size.
Your log showed no errors, and looks normal other than the fact that it extracted no text from the .epub. I suspect that your .epub has problems. I suggest that you 'fix' your .epub by:
(1) reconverting it from an epub to an epub;
(2) running it against the excellent "Modify Epub" plug-in, clicking almost all of the checkboxes;
(3) running it against the "Count Pages" plug-in, confirming that it has "real" pages of text, and is not just an .epub version of a scanned .PDF; and,
(4) converting the reconverted and "fixed" .epub to a .txt format, and then running ENF again for that book.
ENF will use any .txt it finds before using a .epub format, and will use a .epub format before using a .PDF format. The log indicates that priority, and also will indicate which format it used.
After (4) above, open the .txt format in Notepad, and read it. Are there a large number of real English words? If there are, please PM me the .txt file so I can test with it. Thanks.
DaltonST |
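The format priority described in this post (.txt first, then .epub, then .PDF) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ENF's actual code: the name `pick_format` and the `FORMAT_PRIORITY` tuple are hypothetical, and the available formats are assumed to arrive as plain strings.

```python
# Hypothetical sketch of the priority described in the post: TXT, then EPUB, then PDF.
FORMAT_PRIORITY = ("TXT", "EPUB", "PDF")

def pick_format(available_formats):
    """Return the highest-priority format present, or None if none is usable."""
    # Normalize ".pdf", "pdf" and "PDF" to the same spelling before comparing.
    available = {fmt.upper().lstrip(".") for fmt in available_formats}
    for fmt in FORMAT_PRIORITY:
        if fmt in available:
            return fmt
    return None

print(pick_format(["epub", "pdf"]))          # EPUB
print(pick_format([".PDF", "txt", "EPUB"]))  # TXT
```

Under this ordering, a book that gains a .txt format after step (4) would be read from that .txt on the next run, which is why the conversion is worth doing before retesting.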
10-31-2015, 12:11 PM | #7 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
Release 1.0.5
Release 1.0.5 has been posted. Minor performance tweaks.
Absolutely nothing was changed that would 'break' any ebook that processed properly in Release 1.0.4.
Please note that .PDF files that were created from scans have "images", not "text". For that reason, ENF would find zero "text" in a .PDF that is physically huge in size. The "Count Pages" plugin would find nothing as well.
If you have problems with a particular .EPUB file, please review this post for a suggested course of action: https://www.mobileread.com/forums/sho...d.php?t=263684
DaltonST |
04-17-2016, 03:46 PM | #8 |
Enthusiast
Posts: 39
Karma: 10
Join Date: Jan 2009
Location: South Pacific
Device: Kindle DX
|
An interesting plugin. I'd be interested in seeing a similar output of the frequency of "Proper" names and non-English word usage. The characters, places, and invented words do a lot to categorize and compare books. It seems like it would allow you to glossarize books and then compare glossaries to other works.
Pulling proper names out of the copyright pages and "ends" of the book might give you some interesting info on publishers, translators, editors, etc... |
06-07-2016, 06:04 PM | #9 |
Addict
Posts: 260
Karma: 139980
Join Date: Mar 2014
Device: Android
|
Custom word in frequency search
@DaltonST,
I have been looking at this plug-in and trying to apply it but am struggling. After re-reading your description here as well as the Q&A, I'm left with the following:
1) Is it possible to set up the plug-in to read capitalized words directly, without having to extract them and then turn them into lowercase before reading? I wonder if this is what contributes to it taking a LONG time on ONE book. For someone with tens of thousands of books in Calibre, the plug-in then becomes a waste of time in the running (not in the data it could provide). Am I wrong about this contributing to the time it takes to run one book? (For example, I could run Quality Check/Search Epubs and go through a lot of books in comparatively little time.) I am curious, but I also concede that I do not know the code needed to make this work.
2) When I first installed the plug-in (seeing it only in Calibre's list of available plug-ins), I understood it to mean that I could tell it to include the frequency of words I defined. For example, say I want to know the frequency of the word "hall" in a book. This would be a basic text match and thus include words that contain it, such as "hallmark", as well as any capitalized version such as "Hall" or "Hallway". Now, rereading the description and trying to play with the plug-in (at which point I noticed the time it took to run it on default settings for one book), I believe this is not possible. Is it possible that you could modify your app, or create another based on similar principles, that does a word count for user-specified words and creates tags based on this? The purpose would be just as you noted: info about a book that can be very helpful to a user. For example, in my case I'm not fond of books full of vulgarity. Sometimes, you just don't know what you are going to be reading.
I'd like to take what I already do via Calibre and improve my "word existence" search to include a count of the frequency of the word I specify, as well as creating tags in a customized column based on the returned info (rather than in the comments; example of a tag: hall-50). This will help me to better categorize books as to their content and feel. If this isn't something you can do, do you know of a similar app? |
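The request in this post, counting a user-specified word and turning the count into a tag such as hall-50, can be sketched in plain Python. This is not part of ENF and not something the plugin does; the function names `count_word` and `make_tag` are made up for illustration, and the sketch assumes the book has already been converted to plain text.

```python
import re

def count_word(text, word):
    """Count case-insensitive occurrences of `word`, including inside longer
    words, so "hall" also matches "Hall", "Hallway" and "hallmark"."""
    return len(re.findall(re.escape(word), text, flags=re.IGNORECASE))

def make_tag(word, count):
    """Build a tag in the word-count form used in the post, e.g. "hall-50"."""
    return f"{word.lower()}-{count}"

sample = "The Hall was empty. A hallmark hung in the hallway near the hall."
n = count_word(sample, "hall")
print(make_tag("hall", n))  # hall-4
```

Matching substrings rather than whole words is a deliberate choice here because the post asks for "hallmark" and "Hallway" to be counted too; a whole-word count would need word-boundary anchors instead.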
06-07-2016, 06:15 PM | #10 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
@jecilop:
The OP says exactly why it was written, and exactly what it does. What you want is not why it was written, and is not what it does. Sounds like you should uninstall it. DaltonST |
06-07-2016, 06:46 PM | #11 |
Addict
Posts: 260
Karma: 139980
Join Date: Mar 2014
Device: Android
|
Ok, thanks for that input. That was the next step, but I thought I civilly asked you about it.
Please consider not everyone who asks about your app is a tool of some sort. I'm not criticizing it. I was just wondering if I missed something in my understanding or if you could expand on it if not. |
06-03-2017, 01:53 AM | #12 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2017
Location: WA, USA
Device: kindlefire
|
Hi, I found your plugin interesting, and oddly useful. I've been playing around with some ideas of how best to utilize a set of data (words) on which several studies have now been done, confirming the results. When I read the description of this plug-in I thought you might find it interesting as well.
I just posted, this morning actually, a blog post with the full result data file and the published paper describing the intent and method of gathering, with a bit of purple writing around it... hey, I was tired. There is something there, but I'm just not sure what. Google, I also discovered this morning, is stepping up their involvement in the word game with the Sideways Dictionary: https://sidewaysdictionary.com/#/term/phishing and several other projects on the Jigsaw site: https://jigsaw.google.com/projects/
My blog is at https://psyopwriter.blog/2017/06/02/...and-dominance/
I would be interested in hearing your thoughts if you have the time. Not sure when I'll stop by again. Thanks for the plugin, however; it has given me several ideas. |
03-20-2018, 08:34 AM | #13 | |
Member
Posts: 22
Karma: 10
Join Date: Mar 2018
Device: Kindle Voyage
|
Please fix small bug.
After clicking the button to choose the default location for the collection of CSV files:
Quote:
Code:
IsOsX()
Code:
utils.IsOsX() |
|
03-20-2018, 10:41 AM | #14 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
No, actually, "isosx" is a constant. It was not imported prior to use. The correct fix is:
from calibre.constants import isosx
I do not have OSX, so this OSX-specific code has never been tested before now. I will upload a new version in the near future.
DaltonST |
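For readers following along, the fix amounts to importing the constant before it is referenced and then testing it as a plain boolean. The sketch below is an illustration under assumptions rather than the plugin's actual code: the fallback on `sys.platform` and the helper `default_csv_dir` are hypothetical, added only so the snippet also runs outside calibre.

```python
import os
import sys

try:
    # The fix named in the post: isosx is a constant, not a function.
    from calibre.constants import isosx
except ImportError:
    # Assumption for illustration only: outside calibre, approximate the constant.
    isosx = sys.platform == "darwin"

def default_csv_dir():
    """Hypothetical helper: choose a starting folder for the CSV location dialog."""
    home = os.path.expanduser("~")
    # isosx is tested as a plain boolean; it is never called like isosx().
    return os.path.join(home, "Documents") if isosx else home

print(default_csv_dir())
```

Because the constant is only read, never called, forms such as IsOsX() or utils.IsOsX() would not apply here.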
03-21-2018, 06:38 AM | #15 |
Member
Posts: 22
Karma: 10
Join Date: Mar 2018
Device: Kindle Voyage
|
Working fix
Thank you for the update; the bug is eliminated!
|
Tags |
comments, frequency, spanish, tags, translate |
|