11-25-2015, 04:01 PM | #31 |
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
|
A progress indicator ...
Might technically be difficult to create, but if it IS possible, a checkbox to show it (with an advisory about the cost in slowness) would be greatly appreciated.
|
11-25-2015, 04:15 PM | #32 |
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
|
The Inter-Book Search is interesting BUT ...
I have an #origtitle column that I filled up prior to mass downloading metadata. Now, I want to compare the current title with what it was, case being ignored. I do this because the number of wrong books picked during mass metadata downloading is getting out of control. I know that there will be a fair number of 'false' positives where an apostrophe might be added, but a cursory manual glance at the results SHOULD show me where the metadata download has, in fact, grabbed the WRONG book's metadata. I can then manually go through and correct them.
That's what I WANT to do. But every time I select Inter-Book Search and change the For Each Book fields to the proper respective titles, I also change the middle column to NOT. I'm assuming that's what I want. BUT, when I click the Execute Search (All Books) button, it jumps back to AND. And THAT does not get me what I want.
Intriguingly, the result IS interesting. I get the books that have multiple instances of that title. I ended up with three 'sets' of books. In two pairs, the same title was used by two different authors. In the other instance, I had found two different formats of the same book. Thus, I was able to prune the dupe. Not a bad accidental result. BUT, is it possible to get what I was hoping? (And shouldn't conditions that change upon clicking an execute button at least pause to inform the user why and offer a cancel opportunity?)
So, to be clear for reproduction: I check ON Inter-Book Search and change the Left column For Each Book field to title. I change the Right column For Each Book field to #origtitle. I click the NOT radio button in the centre. Then I click the Execute Search [All Books] button. The radio button changes to AND and the search proceeds, taking, in my trial case, 303.656 seconds. There does not APPEAR to be any way to cancel the search.
Thanks for this tool. I have the impression it WILL be a wonderful tool once I understand the intricacies. Indeed, I tried using title in both For Each fields, finding all the books that DO have duplicated titles. (It DID allow me to fix SOME of the titles changed by the metadata download BACK to the correct ones and then do a manual update.)
Does MCS use exact (or case-ignored) searching, or does it use the same fuzzy searching calibre uses, ignoring starting articles in the title?
Appreciate your time and efforts, as always, GM
Last edited by Gary_M_Mugford; 11-25-2015 at 04:26 PM. |
11-25-2015, 05:11 PM | #33 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
If you hover your mouse cursor over the checkbox "Inter-Book Search", a large yellow tooltips-box will open. It explains how to use "Inter-Book Search", including the "Required Settings".
See the attached image of the tooltips. Note "REQUIRED SETTINGS" item "[3]": 'AND' must be selected in the Middle Column.
By design, MCS automatically 'jumps back' to 'AND' plus the other required settings whenever you attempt to use "Inter-Book Search". It doesn't tell you that you disregarded the explicit tooltips instructions; it just corrects your mistakes and continues with the search.
MCS can use regular expressions to do 'fuzzy searches', but only for 'Intra-Book' Searches. MCS 'Inter-Book' Searches only use the '=' operator, as indicated in the yellow tooltips. The comparison is done implicitly, not explicitly, entirely within SQL using a collating sequence of "exact match, not case insensitive" (that is, a case-sensitive exact match).
MCS 'Inter-Book' Searches are an entirely different animal than 'Intra-Book' Searches.
DaltonST |
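For anyone wondering what an exact-match '=' comparison inside SQL looks like in practice, here is a minimal sketch using Python's built-in sqlite3 module. It is not MCS's actual query: the `books` table, its columns, and the `duplicated_titles` helper are hypothetical, made up purely for illustration. It only shows how SQLite's default BINARY collation treats "Dune" and "DUNE" as different, while NOCASE treats them as equal.

```python
import sqlite3

# Hypothetical in-memory table; NOT the real calibre or MCS schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO books (title, author) VALUES (?, ?)",
    [
        ("Dune", "Frank Herbert"),
        ("Dune", "Another Author"),
        ("DUNE", "Third Author"),
        ("Emma", "Jane Austen"),
    ],
)


def duplicated_titles(collation: str) -> list:
    """Return ids of books whose title equals the title of some other book."""
    # The collation name is interpolated because SQL parameters cannot
    # stand in for a collating sequence; only pass trusted constants here.
    sql = f"""
        SELECT DISTINCT a.id
        FROM books AS a
        JOIN books AS b
          ON a.title = b.title COLLATE {collation}
         AND a.id <> b.id
        ORDER BY a.id
    """
    return [row[0] for row in conn.execute(sql)]


exact = duplicated_titles("BINARY")   # case-sensitive exact match
nocase = duplicated_titles("NOCASE")  # ASCII case-insensitive match
print(exact, nocase)
```

With the exact-match collation only the two identically spelled "Dune" rows pair up; the all-caps row joins them only under NOCASE.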
11-25-2015, 07:40 PM | #34 |
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
|
Fair enough ....
You DID warn me. I am guilty of not having read the instructions thoroughly enough.
Accepted. Having said that, can you help me? I need to compare two columns and mark the ones that fail to be equal. Something along the lines of (pseudo code follows, NOT operational code)
strcmp('title','#origtitle',lt,eq,gt) <> eq
or
select * from library where (uppercase(title) <> uppercase(origtitle))
MCS does already find the ones where a SERIES of books got renamed to one or another volume in the series. That happened in 1.3 per cent of the instances in my test data. But unfortunately, mass metadata downloading is failing in other ways, renaming books to something completely unrelated. Sometimes taking a non-fiction book and having it be something very fictional and very different. Those errors push the error in mass metadata downloading in my test sample to something in excess of 7 per cent. Which is NOT good. But MCS might be able to find those problems and let me fix them, making the use of mass metadata downloading a worthwhile endeavour.
Thanks again, GM |
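The second line of pseudo code above can be turned into something runnable along these lines. This is only a sketch with Python's sqlite3 module: the `library` table and its `title` and `origtitle` columns are hypothetical stand-ins taken from the pseudo code, not calibre's real database layout, and this is not how MCS does it. Note that SQLite's built-in `upper()` only folds ASCII letters by default, so accented characters would need separate handling.

```python
import sqlite3

# Hypothetical table mirroring the pseudo code; NOT calibre's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE library (id INTEGER PRIMARY KEY, title TEXT, origtitle TEXT)"
)
conn.executemany(
    "INSERT INTO library (title, origtitle) VALUES (?, ?)",
    [
        ("The Hobbit", "the hobbit"),      # differs only in case: treated as equal
        ("Emma", "Emma"),                  # identical
        ("Dracula", "Cooking for Two"),    # wrong book grabbed
        ("Hamlet's Ghost", "Hamlets Ghost"),  # apostrophe 'false' positive
    ],
)

# Keep only rows whose upper-cased titles differ.
mismatches = conn.execute(
    "SELECT id, title, origtitle FROM library "
    "WHERE upper(title) <> upper(origtitle) ORDER BY id"
).fetchall()

for book_id, title, origtitle in mismatches:
    print(book_id, repr(title), "was", repr(origtitle))
```

The apostrophe row still shows up, which matches the expectation that a manual glance is needed to separate harmless differences from genuinely wrong metadata.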
11-26-2015, 10:34 AM | #35 | |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
Quote:
MCS 'Intra-Book' Searches were originally designed to do just that. See the attached example. What it does not (currently) do is to first transform the values to be compared prior to comparison, such as both to upper case. DaltonST |
|
11-26-2015, 03:28 PM | #36 | |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
MCS: 'Comparison Transform Functions'
Quote:
Release 1.0.19 includes a new feature: Optional 'Comparison Transform Functions'. See the attached images. Thanks for the good idea. DaltonST Last edited by DaltonST; 11-27-2015 at 12:28 PM. Reason: Version 1.0.19 |
|
11-29-2015, 04:27 AM | #37 |
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
|
Dalton,
The MCS ended up doing EXACTLY what I was looking for. The search produced a lot of noise. But the ability to put the columns side by side made a scan for the three per cent of bad mass updates actually pretty easy. It's a bit OCD this collecting habit, what with the need to have the metadata just so right. But it's nice to have the tools to deal. Thanks.
I didn't get it on the first try, working off your screencap. I guessed across first rather than up and down. For what it's worth, when the middle column's top half is used to 'turn off' the right column, I think you might, in fact, disable the right column and turn it ... pink or grey(er) or whatever. It might make it easier for those of us not in your mindset to stumble upon the richness of the tool in a more controlled way.
I've been hesitant to dive into Q&S and Calm and the rest of the toolset since I'm wrapped up in my own programming right now. But it's obvious that I will have to. Will be back to you in the new year. Have the best of holiday seasons. GM |
11-29-2015, 04:34 AM | #38 |
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
|
Dalton,
Due to my predilection for Scandinavian mysteries, I find myself with ONE teensy request more, and that's for comparisons that ignore the diacritics of the extended character set. Not sure whether to consolidate WITH the extended character attributes or whether to just force everything back down to regular ASCII. Thanks, GM |
11-29-2015, 04:47 AM | #39 |
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
|
Last one and I'll let you go. I headed back to Calibre after this and went to the plug in updater planning to hit your paypal button. Ooops. I understand philanthropy and the joy of giving at this time of the year. But it would help if there was a way to say thanks more than just mere words. GM
|
11-29-2015, 05:24 PM | #40 | |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
New Comparison Transform Function in Version 1.0.20
Quote:
See the attached example from Version 1.0.20.
One word of caution: The further a particular metadata language's alphabet drifts from the Roman alphabet, the less accurate the new 'Compare as: Decomposed & Normalized Alphabet' Transform Function will become. Western European languages should (of course) work accurately, but Chinese, Japanese, Korean, Thai, and so forth will be (at best) much less accurate. The only Eastern European language I tested was Polish, and ĶźŽ was viewed as equal to KzZ for the purposes of searching, so it is likely that most of the Slavic languages will work well.
Obviously, if all of the metadata is properly spelled in a particular language, then of course the search will work perfectly. The issues arise when it is not. For example, assume that the original title in Polish contained "ĶźŽŦ", but the translated title contained "KzZF". That would fail a check for equality, because the letter "Ŧ" does not transliterate to an "F". MCS would say they are different for that reason alone.
DaltonST |
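The behaviour described above can be reproduced with Python's standard unicodedata module. This is a sketch of one common way to do a "decomposed and normalized" comparison, assuming Unicode NFKD decomposition followed by dropping combining marks; whether MCS does exactly this internally is not stated in the thread, and the `strip_diacritics` helper name is made up for the example.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Letters that are a base letter plus an accent reduce to the base letter.
print(strip_diacritics("ĶźŽ"))       # accents removed
print(strip_diacritics("Läckberg"))  # umlaut removed

# "Ŧ" (T with stroke) has no Unicode decomposition, so it survives as-is
# and can never compare equal to a plain "F" (or a plain "T").
print(strip_diacritics("ĶźŽŦ") == "KzZF")
```

Letters that Unicode treats as atomic, such as "Ŧ" or the Scandinavian "ø", pass through unchanged, which is why a comparison like the "ĶźŽŦ" versus "KzZF" example still reports a difference.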
|
11-30-2015, 05:12 AM | #41 |
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
|
Dalton,
I think your interpretation works within the confines of my interests. I wonder if you've considered SoundEx? I use it in an application where colour products from other firms are compared to my clients' to create a colour Rosetta Stone, so to speak. Other code's involved of course, but the Soundex is the pre-pruning tool I used to make sense out of a LOT of data. Unfortunately, I'm not a Python programmer, just an old fogey pusher of Pascal code. Just saying it out loud in case your brain starts percolating. GM |
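For readers who have not met Soundex, here is a small Python sketch of the classic American Soundex coding (first letter plus three digits). It is an illustration of the general algorithm only; it is not code from MCS, nor from the Pascal application mentioned above, and the `soundex` function name is just a label for this example.

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for a name."""
    # Consonant groups that share a digit; vowels, H, W and Y have no code.
    codes = {}
    for letters, digit in (
        ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
        ("L", "4"), ("MN", "5"), ("R", "6"),
    ):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = [letters[0]]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break up a run of equal codes
        code = codes.get(ch, "")
        if code and code != prev:
            result.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return ("".join(result) + "000")[:4]


# Names that sound alike collapse to the same code.
print(soundex("Robert"), soundex("Rupert"))
print(soundex("Ashcraft"), soundex("Tymczak"))
```

Used as a pre-pruning step, equal codes mark candidate pairs that a slower or stricter comparison can then confirm or reject.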
12-05-2015, 04:06 PM | #42 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
Version 1.0.21 - New 'Fuzzy Logic' Comparison Functions.
Version 1.0.21 - 05 December 2015 New 'Fuzzy Logic' Comparison Functions.
Refer to the attached examples and also the ToolTips for an explanation. Be sure to read the ToolTips that pertain to the new dropdown selection boxes. DaltonST |
12-06-2015, 06:40 PM | #43 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
Version 1.0.22 - Additional New 'Fuzzy Equality' Comparison Functions
Version 1.0.22 - 06 December 2015 Additional New "Fuzzy Equality" Comparison Functions. Refer to the attached example and also the ToolTips for an explanation.
Attached is an example of an MCS "Cross-Library" Intra-Book Search using one of the two new "Fuzzy Equality" Comparison Functions. DaltonST |
12-10-2015, 01:41 PM | #44 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
Version 1.0.24 New: Double-Metaphone 'Sounds-Like' Fuzzy Equality Comparison Function
Version 1.0.24 - 10 December 2015 New: Double-Metaphone 'Sounds-Like' Fuzzy Equality Comparison Function.
An example is attached below. For more information about this phonetic algorithm: https://en.wikipedia.org/wiki/Metaphone#Double_Metaphone DaltonST |
12-11-2015, 10:43 AM | #45 |
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
|
I have unleashed your inner phonetic OCD, obviously. To my benefit, I think. It's interesting that the Soundex library I used back at the start of the century is so ... antique. I really appreciate the Double_Metaphone link, although Pascal is noticeably absent from the list of supported languages for Metaphone3. Sigh. The Rosetta Stone we developed at my client has done well enough over the years to give the company an advantage, allowing them to figure out the best cross-version to their product to win battles with other suppliers. But it sounds like a lot of the fussin' and cussin' I did to wrangle SoundEx into relevance for the project might have been reduced to about one line of metaphone3 code.
Continue to appreciate your efforts here. I have tested this on chunks of data that are of a considerable size and the return speed is really quite good. A progress gauge remains my one hope-for left. I think the intelligence in it is first rate and I'm STARTING to acquire your thinking and styling. Enough to suggest a serious run at the OTHER big project is in order. I admit to failing in my first attempt. But MCS is going to be my gateway to the other one. Thanks and PLEASE have the best of holiday seasons. GM |