08-03-2012, 03:02 PM | #301 |
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2011
Location: Scotland
Device: Samsung Galaxy Tab
|
Hope I'm in the right place, I posted a bug fix for the duplicate finder plug-in but in the wrong place (on the Calibre support site) and the 'plug in forum thread' link from user plug-ins didn't bring me here so I did a manual forum search.
Please redirect me if I'm still in the wrong place :-( Otherwise I hope someone can help. Here's my post:

Great tool, saved me a lot of time, but two points:

1 - The binary check is not quite 100%. I don't think it's safe to have a checkbox for 'automatically remove duplicates', as you could be deleting books that aren't duplicates. The plug-in selects different books as 'duplicates' about 0.1 percent of the time or less, but I don't want to delete any books that are not genuine duplicates, so I use this to identify possible dupes but still save books to disc and use WhereIsIt! to dedupe them, as it lets you include a check on file name as well as CRC. I think your plug-in info says you guarantee duplicates are found; you should change the wording so people know.

Suggestion: If your 'automatically delete' function was set to delete only if the book title or author are the same, then that would probably reduce the chance of errors to an infinitesimal amount and I would use it to auto delete. You can easily do a manual review of the 'duplicates' that weren't deleted automatically, fix book name/author errors, then run the tool again.

2 - Bug in 'clear duplicate results'. I've used your tool several times and it found loads of duplicates, cleared the results OK, then did another search. Unfortunately the 'clear duplicate results' doesn't seem to work any more, so I can't use Calibre normally now. I cleared the duplicate results, but when I click on an author or series I see every single book in the window and not just the author's. If I do an author search via the command line I get the same thing.

Attempts to fix: I shut down/restarted Calibre, updated the plug-in, updated Calibre, but still the same. There must be something in the background settings that still thinks I want duplicates found. I'm reluctant to completely uninstall and reinstall Calibre and hope it won't come to this.

Just tried to disable the plug-in in 'user plugins' to see if that would clear it and put Calibre back to normal, but it says it can't be disabled :-( Using Windows Vista. Calibre 8.62. I can provide screen caps etc. Hope you can fix this, as this plug-in is a great idea. |
08-03-2012, 03:49 PM | #302 |
Calibre Plugins Developer
Posts: 4,678
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@joolzt - thx for the donation btw.
(1) The filename is completely irrelevant when it comes to identifying duplicates. Two files which have the same SHA hash as computed by this plugin most definitely are duplicates. This plugin can pick up books which have been incorrectly catalogued in a user's library with differences in title/author, which will result in a different filename. That is why the filename has no relevance to whether a book is considered a duplicate or not.

Re (2): Not something I can replicate here; it clears every time, regardless of whether I show one group at a time or all groups at once. You can either hit Escape, click on the clear search button next to Go, or click on "Clear duplicate results" on the Find Duplicates menu. In all those circumstances the search restriction gets cleared and the highlighting mode (red slash across the blue bars on the button next to saved searches) will get displayed as being turned off. That is *assuming* you had it turned off before you entered find duplicates mode (as it restores whatever "state" you were in prior to using the plugin). So if you had a search restriction or highlighting mode turned on before you entered find duplicates, that is exactly what will get restored when you clear out of it. |
08-03-2012, 06:36 PM | #303 | ||
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2011
Location: Scotland
Device: Samsung Galaxy Tab
|
No prob, it wasn't intended as a bribe for a reply; it was genuine thanks for saving me loads of time and speeding up removal of dupes.
I repeated the search. I set it to show all groups at once and to sort by the number of duplicates, so I assumed that the results would show groups of duplicate files together. So when I saw this:

author 1, series 1, title 1, size 0.1
author 1, series 1, title 1, size 0.1
author 2, series 2, title 2, size 8.0
author 3, series 3, title 3, size 6.6
author 3, series 3, title 3, size 6.6

... I realised that item 3 couldn't possibly match those before or after, and jumped to the conclusion that a few files were being picked up as dupes in error. Most of my huge list of dupes (99%+) appeared together in groups which were clearly sets of duplicates, even if there were punctuation differences or missing series names, so these out-of-order files made me wary.

Now that I can browse 'author 2' correctly, I see that there are definitely two 'author 2 series 2 title 2 size 8.0' books, but they must just be sorted in different places in the original list. I understand that there may be multiple duplicate matches for a book record with several formats in it, but the third file above was a PDF that didn't relate to those it was sorted with, so it got my systems analysis nose twitching.

I'd hoped that, once my dupes are down to a manageable level, I'd be able to sort dupes in groups and skim through looking for anomalies (as I did above) before searching again with the auto delete on. My assumption that the list is sorted in groups must be wrong, but it's still a great tool.

Of course, I'm still wary of an auto delete based on CRC. If I have identical books but with different authors and titles due to an error, and they are sorted apart from each other on the list, then I wouldn't notice the different title/author in the same 'dupe group', and a computer couldn't decide which was correct. So I come back to my original point that having an optional secondary check before an auto delete might be a good thing.
Then again, once I have few enough dupes to show and process one group at a time this becomes irrelevant, it's just that I have far too many dupes at the moment to do that, which is why I am using 'show all' and saving to disc in batches and using WhereIsIt! for bulk deduping. Your plug-in is still a great help as it picks up all the probable dupes for me to check, and once my lib is clean I can maintain it just with the plug-in. Thanks for creating this plug-in, it's a great help. |
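For readers following the discussion, the optional secondary check suggested in this post could look roughly like the sketch below. It is only an illustration of the idea, not something the plugin does: the function names and the dictionary layout with `title` and `author` keys are hypothetical, and the comparison simply ignores case and extra whitespace.

```python
def normalise(text):
    """Lower-case the text and collapse runs of whitespace (illustrative only)."""
    return " ".join(text.casefold().split())


def passes_secondary_check(book_a, book_b):
    """Return True when two binary-identical books also agree on title or author.

    book_a and book_b are hypothetical dicts with 'title' and 'author' keys.
    Pairs that fail this check would be left for manual review instead of
    being removed automatically.
    """
    same_title = normalise(book_a["title"]) == normalise(book_b["title"])
    same_author = normalise(book_a["author"]) == normalise(book_b["author"])
    return same_title or same_author
```

A pair that fails the check would stay in the results list, so the title/author error can be fixed by hand before the search is run again.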
||
08-03-2012, 07:05 PM | #304 |
Calibre Plugins Developer
Posts: 4,678
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Hey, I don't mind if it was a bribe. It sure beats people who come to the threads and *demand* that an xyz feature or change must be made...

Glad to hear you got #2 sorted. As for the deletion, please take *no notice* of the "size" column. It is an utterly meaningless column (I don't ever bother displaying it) unless you only ever store one book format. It is only ever going to show you the size of the largest format, and even that is only at a "point in time" - editing the files using external tools like Sigil will make that number out of date.

This plugin compares the exact file size in bytes of the physical file, and only if those match does it then do the next step of computing and comparing an SHA hash. As I mentioned above - when it says you have a binary duplicate, it really *is* a duplicate. The auto-delete function simply removes one of those binary copies; it doesn't touch your book records in calibre, so you lose zero data.

I only added the feature as a convenience for users for two reasons:

(1) Since it is 100% safe to remove the duplicate file, it automates something that users would otherwise expend a lot of effort doing one by one.

(2) It is impossible in the calibre GUI to show *which* format is the binary duplicate in the scenario where both book records have multiple formats the same. That leaves the user confused, trying to work out which format it is safe for them to delete.

So... since it is 100% safe to delete them, I really don't think adding another dialog in there is necessary. That checkbox option to delete them is turned off by default in the plugin, but there is absolutely no downside to turning it on. Anyways, enough on this. If you still aren't convinced, simply uncheck the option. |
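The two-step comparison described in this post (exact size in bytes first, hash only when sizes match) can be sketched in plain Python as below. This is not the plugin's source code: the function names are made up for illustration, and SHA-256 is an assumption, since the post only says "an SHA hash".

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large books are not read into memory at once."""
    digest = hashlib.sha256()  # assumption: the post does not say which SHA variant
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_binary_duplicates(paths):
    """Return groups of files whose contents are identical, ignoring filenames."""
    # Step 1: bucket by exact size in bytes; files of different size cannot match.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size, so no hashing is needed
        # Step 2: only files that share a size get hashed and compared.
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Note that the filename never enters the comparison, which is why two copies catalogued under different titles or authors still end up in the same group.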
08-04-2012, 10:33 PM | #305 | |
null operator (he/him)
Posts: 20,952
Karma: 27620688
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
I use the save list feature, which I print, and I use a pen to strike out what I've done. One idea I had was to move the duplicates from the two libraries to a third library. Then, when you've resolved the conflicts, you move what's left to where you want it. For me that would be a 'nice to have'. Over to kiwidude.
BR
Last edited by BetterRed; 08-05-2012 at 02:43 AM. Reason: clarity |
|
08-06-2012, 02:59 AM | #306 |
Enthusiast
Posts: 29
Karma: 10
Join Date: Jul 2012
Device: Kindle 3
|
In my case, I have my "full library" and another one for "massive imports". When I download a new collection of books, I add them to the import library. Then I want to check for duplicates against my full library, so I can delete the duplicated ones from the import library and finally insert into the full library those books that remain in the import library.
So, in my case, the easiest solution would be enough: keep the duplicated books of the open library selected. |
08-11-2012, 07:38 AM | #307 |
Enthusiast
Posts: 42
Karma: 13798
Join Date: Feb 2011
Device: kindle 3
|
find Unique
I was wondering how hard it would be to get a plugin that is the reverse of find duplicates. In other words one to compare libraries for unique books.
I love the last enhancement allowing you to find duplicates between libraries, and would love to be able to find unique books when comparing libraries as well. Many thanks for the great work done so far. |
08-11-2012, 07:41 AM | #308 |
US Navy, Retired
Posts: 9,865
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
|
08-12-2012, 04:04 AM | #309 | |
Enthusiast
Posts: 42
Karma: 13798
Join Date: Feb 2011
Device: kindle 3
|
Quote:
That way if I am looking at a 4,500 book library that is being compared against my 25,000 book library it will tell me which books I don't have rather than which are duplicates. In this case the duplicate numbers would probably be about 4,200 and the books I don't have around 300. This plugin would make it easier to identify the 300 |
|
08-13-2012, 06:32 AM | #310 |
Calibre Plugins Developer
Posts: 4,678
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
v1.5.3 Released
Changes in this release:
It will display for all search types except for "Ignore Title" (author based) searches. I couldn't see the point in displaying all books for an author in that circumstance.

Re the query about showing books that are not duplicates: after running this check now, you will see marked:library_duplicates in the search bar. If you then type "not marked:library_duplicates", this will give you all the books that are not duplicates according to whatever duplicates criteria you searched on. I'm not entirely sure of the use case, but the simple fact is you can do it if you need to. |
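The logic behind "not marked:library_duplicates" is just the complement of the marked set. A small sketch of that idea in plain Python follows; it does not use calibre's API, and the function name and the use of integer book ids are assumptions for illustration.

```python
def split_by_duplicate_mark(all_book_ids, duplicate_ids):
    """Split a library's book ids into (marked duplicates, everything else).

    The second list corresponds to what a "not marked:..." search would show:
    the books in the open library with no match under the chosen criteria.
    """
    duplicates = set(duplicate_ids)
    marked = [book_id for book_id in all_book_ids if book_id in duplicates]
    unmarked = [book_id for book_id in all_book_ids if book_id not in duplicates]
    return marked, unmarked
```

For the case raised earlier in the thread, a 4,500 book library compared against a 25,000 book one, the second list would be the few hundred books missing from the larger library.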
08-14-2012, 04:24 AM | #311 |
Enthusiast
Posts: 29
Karma: 10
Join Date: Jul 2012
Device: Kindle 3
|
Thanks a lot kiwidude
|
08-17-2012, 08:15 AM | #312 | |
Enthusiast
Posts: 42
Karma: 13798
Join Date: Feb 2011
Device: kindle 3
|
Quote:
Thanks again |
|
08-23-2012, 08:31 PM | #313 |
Junior Member
Posts: 5
Karma: 10
Join Date: Jul 2012
Device: Kindle Touch
|
Kiwidude
Thanks for all the awesome plugins! You've made a huge contribution to Calibre usability and helped so many of us. Thanks for the ability to show books in one library that are missing from another. I was about to ask if that was possible.

Suggestion for the Metadata Variations screen (which is excellent for bulk updates of new libraries): when looking at Author variations, allow any of the shown variations to be easily taken as the Rename To name for them all. Some of my authors appear in 3 or more forms. An excellent enhancement would be to let a click on any of the alternate names in the right-hand list copy that name down as the Rename To name.
Last edited by chis; 08-24-2012 at 12:26 AM. |
08-24-2012, 05:23 AM | #314 |
Calibre Plugins Developer
Posts: 4,678
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@chis,
Thanks for the kind words, and glad the plugins have been useful to you. The problem with the selection idea is that there is already a purpose for clicking on the right side - it controls the ability to select/deselect items, as it is possible that you don't want to rename all of the variations that it found. The way I saw people handling the "this is not the name you were looking for" issue is to scroll down the list on the left-hand side and find that variation there. All the permutations are available in that list. I can't think of an alternative way just at the moment, though if someone has a suggestion I will consider it. Enjoy the plugins. |
09-09-2012, 07:40 PM | #315 |
Member
Posts: 21
Karma: 10
Join Date: Jul 2011
Device: Sony PRS 650
|
please tell me there is a way to say: DELETE ALL DUPLICATE FOUND!
I currently have 1 library with around 40 percent duplicates... they are 1-to-1 duplicates from an old library!! How do I remove the copies? Finding them is great, but I am not deleting them 1 by 1. If there is a way to remove the duplicates, PLEASE tell me! thx
Last edited by BelgarionNL; 09-09-2012 at 08:10 PM. |