02-03-2011, 06:08 PM | #46 | |
Calibre Plugins Developer
Posts: 4,685
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Quote:
However your responding post was much clearer and sounds to me the same as what we have been proposing in earlier posts, just with a combobox of automerge suboptions instead of radio buttons. Great stuff. |
|
02-03-2011, 06:18 PM | #47 |
creator of calibre
Posts: 44,542
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
@Starson17: I find I like to work in bursts; surfing MR is a good way to let my brain degauss between coding sessions
Yes, that should work. I don't recall if it is ('translated string', 'option') or ('option', 'translated string') but a bit of experimentation will tell you |
02-04-2011, 11:46 AM | #48 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
02-04-2011, 12:32 PM | #49 | ||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Quote:
The functionality for multiple matching records hasn't been mentioned here before, but in the existing automerge code (to be the "Ignore" option in the new code) we put a new incoming format into all matching records, if possible, and ignore it, if not. For "Overwrite" I'm going to overwrite or add the format to all - again that's easy and already done.
The question is: what to do with the "New Record" option for multiple matching records? Suppose I have 5 book records already in the database, all with the same author/title, but some with the same format as the incoming format and some without. Is this a "create a new record" situation or not?
1) Shall I add the new incoming format to the records that don't have it and create exactly one new record?
2) Or do I not create the new record because the incoming format was already added to at least one existing book record?
3) Or do I create one new book record every time I find an existing book record I can't add the new format to (but add it to all records where possible) - thereby creating multiple identical new book records.
4) Or do I create one new record and not add it to any existing records?
5) Or do I create one new book record every time I find an existing book record I can't add the new format to (and not add it to any existing records).
I lean towards option 1, but option 3 is how the new code is working now. Option 3 is stateless - as in I don't need to keep track of what I did (or will do) for other matching book records. I just check each matching book entry, compare the new format to the formats in that record and create a new record if I can't stuff the incoming format into the matching book record.
Note that for option 3 you automatically get one independent record for every case where you already had that format in another record. You can then work your way through that list and manually "Merge-Delete Others" each record into the existing record that caused it to be created (where you wanted overwrite) or just delete it (where you wanted ignore) or just manually delete all but one (if you like option 1).
Can I convince anyone option 3 is the best .... it's much easier - because it's already done. Comments? |
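To make the difference between options 1 and 3 concrete, here is a rough Python sketch. It is not the actual calibre automerge code: the record layout (a dict with a "formats" set) and both function names are made up for illustration, and it assumes option 1 only creates its single new record when at least one matching record already holds the incoming format.

```python
def new_record_option_1(matches, fmt):
    """Option 1 (as read above): add fmt wherever it is missing and
    create at most ONE new record if any match already had fmt."""
    conflict_seen = False  # state carried across the matching records
    for rec in matches:
        if fmt in rec["formats"]:
            conflict_seen = True
        else:
            rec["formats"].add(fmt)
    return 1 if conflict_seen else 0  # number of new records to create


def new_record_option_3(matches, fmt):
    """Option 3: stateless; every match that already has fmt
    triggers its own new record, the rest receive the format."""
    new_records = 0
    for rec in matches:
        if fmt in rec["formats"]:
            new_records += 1
        else:
            rec["formats"].add(fmt)
    return new_records


def sample_matches():
    # Hypothetical library: 5 records with the same author/title,
    # 2 of which already contain an EPUB.
    return [
        {"id": 1, "formats": {"EPUB"}},
        {"id": 2, "formats": {"MOBI"}},
        {"id": 3, "formats": {"EPUB", "PDF"}},
        {"id": 4, "formats": set()},
        {"id": 5, "formats": {"LIT"}},
    ]


print(new_record_option_1(sample_matches(), "EPUB"))
print(new_record_option_3(sample_matches(), "EPUB"))
```

With this sample, option 1 asks for a single new record while option 3 asks for one per record that already held the format, which is the list of extra records you would then have to clean up by hand.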
||
02-04-2011, 12:44 PM | #50 |
creator of calibre
Posts: 44,542
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I'd say 1 sounds right
|
02-04-2011, 03:31 PM | #51 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
02-04-2011, 05:40 PM | #52 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
02-04-2011, 06:37 PM | #53 |
Calibre Plugins Developer
Posts: 4,685
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Thanks Starson17, look forward to playing with it when Kovid has merged it in.
Guess that then leaves the onus next on me. I will need to change the Quick Preferences plugin to support the new automerge options. Then there is the whole issue of the "Duplicate Find" plugin. Charles had some really good food for thought on this that he sent me via e-mail which I need to digest. I would rather he posted them here himself, or I could ask him for permission to mention them, as they are certainly worth discussing when deciding which approach to take for the plugin. |
02-07-2011, 03:58 PM | #54 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
02-07-2011, 07:09 PM | #55 | ||
Calibre Plugins Developer
Posts: 4,685
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Quote:
I wanted to respond to a comment I made/Starson17's response in a different thread here for further discussion: Quote:
In my mind these are the reasons why people may be in a duplicate situation (there could well be others I haven't thought of):
Also by keeping the same conservative matching logic, a user can have confidence that the duplicate results that appear are "genuine" duplicate scenarios. Well, unless their title or author is completely wrong of course, but that is something only visual inspection of the book format can identify. The results of a duplicate search using this logic may contain groups that require less manual visual inspection to merge together than others (i.e. groups which have multiple books but not duplicate formats, and groups that do have duplicate formats).
However there are still problems which prevent the plugin from being "automatic" in merging those "safer" ones. As Starson17 at least is aware, the problem is the book metadata. If you have only just added a duplicate to your library, then the "oldest" duplicate by book date is most likely (but not always) the one which contains the metadata you want to keep (series information, comments, ISBN, cover, conversion settings etc). That behaviour would effectively match what happens with automerge turned on today when adding books and you add new formats of a book. However what if (as frequently happens) both book records have been in your library for a while, so both have metadata assigned but they differ in content? Maybe they have different series names (or one has one, the other doesn't), etc, etc. That is why there are so many "merge" submenus - as a user you have been given the power to merge to cover lots of scenarios of wanting only certain data kept with a particular merge direction.
Which (slowly, sorry) brings me back to my initial suggestion in the thread of wanting to use the power of the library view for the plugin (rather than a popup dialog displaying duplicate search results). Doing so means the user has the full merge menus that exist and that they know today. They can pop open the edit metadata dialog or make changes directly in the grid before they merge. They can roll up and down comparing covers/comments in the book view. If they have custom columns like "Read yes/no" these will be visible to help identify which version to keep. Of course they also have all the existing ability to view formats. Sure we could duplicate most of this functionality into a popup dialog, but that's not a great long term solution imho.
What users don't have in the library view currently is a way of visually identifying duplicate groups. As Charles kindly suggested to me via email, one possibility is to use a custom column, which stores a duplicate group number against all the potential candidates as a result of running the duplicate find. Perhaps you could also add a second column giving you some kind of informational message or severity (may not be needed). You would be able to use the tag browser and search capabilities of Calibre to query against/display your duplicate groups. We can also wrap that up in helper menu items in the plugin to make it easy to bring back your duplicates. That imho is the only way for users to have sufficient information available and gui options to make the "hard decisions" about resolving a merge group.
The plugin would take responsibility for populating the duplicate groups custom column, with presumably the ability to toggle the custom columns in/out of your library view. And you could launch different types of duplicate searches with the plugin - initially the "exact author, fuzzy title" logic, but eventually other "fuzzier" searches. You can take your time resolving the duplicates across multiple Calibre sessions without rerunning the search since there is no popup dialog. Or you can run it again across different subsets of records/different matching algorithms etc.
Those are my long winded thoughts for now, comments appreciated as always. There are of course still issues, like clearing the duplicate group custom column values after a merge. And once you start supporting "fuzzier" matching algorithms, you have the issue of repeatedly looking at false positives unless we come up with a way of the user flagging exclusions over time.
Last edited by kiwidude; 02-08-2011 at 08:50 AM. Reason: fixed typos |
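As a rough illustration of the "duplicate group number" idea, here is a small Python sketch. It does not touch the calibre database or any plugin API; the function names, the (book_id, author, title) tuples and the particular title normalisation (lowercase, strip punctuation, drop a leading article) are all assumptions chosen only to show the "exact author, fuzzy title" grouping. The resulting mapping is what a plugin might then write into a custom column.

```python
import re
from collections import defaultdict


def fuzzy_title(title):
    """Hypothetical 'fuzzy' key: lowercase, no punctuation,
    no leading article, collapsed whitespace."""
    t = re.sub(r"[^\w\s]", "", title.lower()).strip()
    t = re.sub(r"^(the|a|an)\s+", "", t)
    return " ".join(t.split())


def assign_duplicate_groups(books):
    """books: iterable of (book_id, author, title).
    Returns {book_id: group_number} for books that share an exact
    author and a matching fuzzy title with at least one other book."""
    buckets = defaultdict(list)
    for book_id, author, title in books:
        buckets[(author, fuzzy_title(title))].append(book_id)

    groups = {}
    group_no = 0
    for key in sorted(buckets):      # stable, repeatable numbering
        ids = buckets[key]
        if len(ids) > 1:             # only real candidate groups get a number
            group_no += 1
            for book_id in ids:
                groups[book_id] = group_no
    return groups


library = [
    (1, "Frank Herbert", "Dune"),
    (2, "Frank Herbert", "Dune!"),
    (3, "Frank Herbert", "Dune Messiah"),
    (4, "Iain M. Banks", "The Player of Games"),
    (5, "Iain M. Banks", "Player of Games"),
]
print(assign_duplicate_groups(library))
```

Books that end up without a group number (book 3 here) simply leave the column empty, so sorting or searching on the column brings only the candidate groups together.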
||
02-08-2011, 08:56 AM | #56 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Update it - I had an error in the code, Kovid fixed part of it, I fixed some more, he improved my code logic, etc., so it's changed a few times.
Quote:
|
|
02-08-2011, 11:02 AM | #57 | ||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Quote:
|
||
02-08-2011, 11:21 AM | #58 |
Calibre Plugins Developer
Posts: 4,685
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Sorry for the long ramble, it was the closest I had come to trying to organise my thoughts. If I was heading off the reservation with them I would much prefer the public tomato throwing for a few posts over days of wasted dev effort...
Yes, I must confess, to give additional credit to Charles, I think he mentioned the same thing, and as my post was super long already I didn't include it. I haven't used the highlighting feature personally as yet so wasn't sure how exactly it fit in but your description sounds great. |
02-08-2011, 11:59 AM | #59 | |
Wizard
Posts: 3,455
Karma: 10484861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
|
Quote:
This way you do not need to do any UI for merging. User simply runs plugin *unattended* (very important for large libraries ;-) ) and then sorts books by that column. Then the user can merge the duplicates at his/her leisure. |
|
02-08-2011, 03:36 PM | #60 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
|
|