03-10-2024, 04:31 AM | #1 |
Junior Member
Posts: 4
Karma: 82
Join Date: Mar 2024
Device: Kobo Libra 2
|
Automatically create clean tags AND genres for your library
## Notes
The goal of this post is to provide arguably the best method of organizing your Calibre library without doing any manual classification. It assigns each book a set of tags and a single hierarchical genre (the way physical bookstores classify books).

**IMPORTANT: This method will modify your library irreversibly. You're strongly encouraged to back up your library before proceeding.**

### Tags & Genres

This approach should result in your books having relevant tags and the genres that a decent bookstore would place them in. Isaacson's "Benjamin Franklin" would automatically get placed into Nonfiction.Biography, Frank Herbert's "Dune" would get placed into Fiction.Science Fiction, and so on.

### Background

There are two general methods of categorizing information: tags and a hierarchical genre structure. Both have benefits and drawbacks.

Tags provide a comprehensive way of assigning all relevant topics to a book, but they often become disorganized, duplicated and unwieldy. It's not uncommon for people with large libraries to have thousands of inconsistently named tags (sciencefiction, Science Fiction, science-fiction) with many duplicates.

Genres, on the other hand, aim to find the single most relevant classification. While "Dune" could be accurately tagged "Adventure, Fantasy, Science Fiction, Novels, Space Opera", I think we'd all agree that if you had to place it in a single aisle in a bookstore, it would be best placed in the "Science Fiction" aisle.

This method gives the best of both worlds: multiple tags and a single canonical hierarchical genre.

### Tagging

On to the good stuff. First, we'll be using Goodreads as the sole source of tags for books. Goodreads provides the best tagging classification I've yet seen. There is less noise than in the publisher-provided tags, and the tags seem to have been de-duplicated to provide a good starting point for classification.
To get started with Goodreads, make sure you have the Goodreads plugin by Grant Drake.

**Warning: This method requires you to wipe all your existing tags and rely on Goodreads-provided tags.**

First we'll remove all existing tags on your books.

#### Removing tags

To remove all your existing tags, select all the books you'd like to update, then right-click and choose Edit metadata -> Edit metadata in bulk. Go to the "Basic metadata" tab and tick the "Clear all" checkbox to the right of the "Remove tags" input field. Finally, click "OK" to clear the tags from your library.

#### Adding Goodreads Tags

Select all the books you'd like to update, then right-click and choose Edit metadata -> Download metadata and covers. Then click the "Configure download" button. Make sure that ONLY the Goodreads source is checked under "Metadata sources".

This step is optional, but I prefer to have Goodreads' original tags, so I click "Configure selected source" for Goodreads and uncheck the "Filter and map genres to calibre tags" option. While on this screen, make sure that the "Tags" checkbox is selected (it should be by default). You can leave the other metadata fields checked if you want to override your existing library metadata for those fields. I also set the maximum number of tags to 10 (any more than that for a single book will likely be noise). Finally, click "Save".

Here's my configuration: [screenshot]

Click "Apply" to update the configuration. Then, on the Schedule Download page, click either "Download only metadata" if you only want metadata, or "Download both" if you'd like to download metadata and cover images from Goodreads. I already had cover images in my library, so I clicked "Download only metadata" to speed up the process. If you're missing cover images, you might want to grab both.

This process will take several minutes for larger libraries as Calibre fetches tags from Goodreads.
Once this is complete, Calibre will ask whether you want to update your library with the downloaded metadata. Click "Yes" to agree. This will also take several minutes if you have a larger library. At the end of this process, most of the books that could be matched should have updated tags from Goodreads.

If all you want are clean tags in your library, you can stop the guide here. If you'd like to classify books by genre as well, read on.

### Auto Genres

Now that you have a single source for your tags, the next step is to classify each book into a single hierarchical genre. I prefer to keep genres simple, with only a single level below the top of the hierarchy. The top-level genre for a book will be either "Fiction" or "Nonfiction". The next level is the single most relevant tag.

Fortunately, Goodreads tags are returned with an important property: they are pre-sorted in order of relevance! That means that to determine a book's genre, we first extract either the "Fiction" or "Nonfiction" tag from the book, then take the next most relevant tag.

To do this, I'm going to start by creating an intermediate column called "Auto Genre", which is automatically generated from the imported tags. Go to "Preferences" -> "Add your own columns". Set the "Lookup name" field to "autogenre" and the "Column heading" to "Auto Genre". The "Column type" field should be "Column built from other columns". Finally, in the "Template" field, paste the following code:

    program:
        tags = field('tags');
        unsorted_tags = raw_list('tags', ',');
        top_tag = str_in_list(tags, ',', 'Fiction', 'Fiction', 'Nonfiction');
        subtags = list_difference(unsorted_tags, 'Nonfiction,Fiction,Audiobooks,Audiobook', ',');
        first_subtag = list_item(subtags, 0, ',');
        genre = list_union(first_subtag, top_tag, '.');

This code keeps the default ordering provided by Goodreads (Calibre normally sorts tags alphabetically), then extracts the top-level tag by checking whether a book is tagged Fiction or not.
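For readers who find the template language hard to follow, the logic described above can be modelled in a few lines of Python. This is only a sketch, not Calibre's own implementation: the function name `auto_genre` and the `EXCLUDED` set are hypothetical, the input is assumed to be the tag list in Goodreads relevance order, and matching is assumed to be case-insensitive. The output order (top-level genre first) follows the examples given earlier in the post.

```python
# Hypothetical Python model of the "Auto Genre" template logic.
# Tags that are never used as the second-level genre (taken from the template).
EXCLUDED = {"nonfiction", "fiction", "audiobooks", "audiobook"}


def auto_genre(tags):
    """Return 'Fiction.<tag>' or 'Nonfiction.<tag>' for a relevance-ordered tag list."""
    cleaned = [t.strip() for t in tags if t.strip()]
    lowered = [t.lower() for t in cleaned]

    # Top level: "Fiction" only if an exact "Fiction" tag is present,
    # otherwise the book falls through to "Nonfiction".
    top_tag = "Fiction" if "fiction" in lowered else "Nonfiction"

    # Second level: the first (most relevant) tag that is not excluded.
    subtags = [t for t in cleaned if t.lower() not in EXCLUDED]
    first_subtag = subtags[0] if subtags else ""

    # Join the non-empty parts with '.' to form the hierarchical genre.
    return ".".join(part for part in (top_tag, first_subtag) if part)


if __name__ == "__main__":
    print(auto_genre(["Science Fiction", "Fiction", "Classics"]))  # Fiction.Science Fiction
    print(auto_genre(["Biography", "Nonfiction", "History"]))      # Nonfiction.Biography
```

Note that in this model a book with no exact "Fiction" tag, including a book with no tags at all, ends up under Nonfiction.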
If you'd like additional top-level tags, you will need to modify this code, but keeping it simple with two top-level tags works for me. The most relevant genre (e.g., "Science Fiction" for Dune) is determined by extracting the first tag that doesn't match any of the categories 'Nonfiction,Fiction,Audiobooks,Audiobook'.

Here's what it should look like before creating the column: [screenshot]

Click "OK" to generate the "Auto Genre" field.

### Genres

*You can skip this step if you don't want the genre in the tag browser. However, I find it useful for quickly filtering my Calibre library, so I recommend this step.*

You should now have tags and the "Auto Genre" field populated. Unfortunately, you cannot use the "Auto Genre" field in the tag browser because it's a computed field, and Calibre needs a field with fixed data for the tag browser. To work around this, we'll copy the values from the "Auto Genre" column into a fixed "Genre" column. The "Auto Genre" column updates every time tags are modified, but the "Genre" column will remain fixed after the copy operation.

First, if you don't already have it, create a new column with the "Lookup name" set to "genre" and the "Column heading" set to "Genre". The column type should be "Text, column shown in the tag browser".

Once the Genre column is created, you can copy values from the Auto Genre column. Select all of your tagged books, then go to "Edit metadata" -> "Edit metadata in bulk" and open the "Search and replace" tab. Set up your search and replace settings as follows:

- The "Search mode" field must be set to "Regular expression".
- The "Search field" should be set to "#autogenre".
- The "Destination field" should be set to "#genre".
- Both the "Search for" and "Replace with" fields should contain the exact same value, so that the field gets copied with no changes (Calibre requires some text here, so the value is arbitrary; I use the letter "z").
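The reason identical "Search for" and "Replace with" values act as a plain copy can be illustrated with Python's `re` module. This is a minimal sketch under the assumption that Calibre applies the substitution to the source field's value and writes the result to the destination field, as the steps above describe; the function name `copy_via_search_replace` is hypothetical.

```python
import re


def copy_via_search_replace(source_value, search_for="z", replace_with="z"):
    """Model of the bulk-edit step: substitute on the source value and
    return what would be written to the destination field."""
    # Replacing every "z" with "z" changes nothing, whether or not the
    # pattern matches, so the destination receives an exact copy.
    return re.sub(search_for, replace_with, source_value)


if __name__ == "__main__":
    print(copy_via_search_replace("Fiction.Science Fiction"))
    print(copy_via_search_replace("Nonfiction.Jazz"))  # contains "z", still unchanged
```

Any literal character would work equally well in this model, as long as the same value is used in both fields.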
You can save these search/replace rules as "AutoGenre to Genre" in case you need to update your "Genre" column in the future after adding more books. Now click "OK", and this should do a one-time generation of your "Genre" column.

### Wrap up

You should now have tags from Goodreads, an "Auto Genre" column that updates automatically as tags change, and a "Genre" column that holds static text. The "Genre" column should appear in the tag browser, where you can filter your library by Fiction, Nonfiction, or any of the respective subgenres.

You will also be able to use the Genre or Auto Genre field to create a directory structure on your e-reader based on genres. https://manual.calibre-ebook.com/sub...late-functions

I didn't use the genre directory structure, but if you wanted it because you had a very large library, you could use a save-to-disk template like:

```
{#genre:subitems(0,1)||/}{#genre:subitems(1,2)||/}{title} - {authors}
```

This would create a directory structure like:

    Fiction/Science Fiction/Dune - Frank Herbert.epub
    Nonfiction/Biography/Benjamin Franklin - Walter Isaacson.epub

I have a Kobo device, so I use the Auto Genre field to create collections on my device by setting the "Collections columns" field to "#autogenre" in the KoboTouch plugin. This process will vary if you're on another type of device.

Last edited by pdurrant; 03-10-2024 at 06:27 AM. Reason: put images in their place in the text |
03-10-2024, 06:43 AM | #2 |
Grand Sorcerer
Posts: 6,393
Karma: 12408443
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Clara/Aura One/Forma,XiaoMI 5, iPad, Huawei MediaPad, YotaPhone 2
|
Just a word of warning: using genres from Goodreads may set your Harry Potter book as "Realismo Mágico"...
|
03-10-2024, 09:11 AM | #3 |
Junior Member
Posts: 4
Karma: 82
Join Date: Mar 2024
Device: Kobo Libra 2
|
No, it won't. Using this method, all of your Harry Potter books would automatically be placed in the "Fiction.Fantasy" genre, which is the same one I would choose if I had to do it manually.
Goodreads improved their tagging system at some point over the past few years. Maybe it was more disorganized before. |
03-10-2024, 10:16 AM | #4 |
Grand Sorcerer
Posts: 6,393
Karma: 12408443
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Clara/Aura One/Forma,XiaoMI 5, iPad, Huawei MediaPad, YotaPhone 2
|
Ok, then it should be genres (as I already found it at that genre some months ago).
|
03-10-2024, 10:40 AM | #5 | |
the rook, bossing Never.
Posts: 12,344
Karma: 92073397
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Quote:
Example: https://www.amazon.co.uk/dp/B004NBZFUW/ This has been frequently reported. It's listed under Regency Romance, but it is neither Regency nor Romance; it's one of her historical, non-romantic novels. She also wrote detective stories. Usually neither Amazon nor their billboard (Goodreads) will fix anything. |
|
03-10-2024, 10:48 AM | #6 | |
Junior Member
Posts: 4
Karma: 82
Join Date: Mar 2024
Device: Kobo Libra 2
|
Quote:
Yes, it will also get tagged as Romance, Historical Romance, and Regency. But would you agree the genre is accurate? |
|
03-10-2024, 10:57 AM | #7 |
Junior Member
Posts: 4
Karma: 82
Join Date: Mar 2024
Device: Kobo Libra 2
|
One other point I didn't really mention is that the "Auto Genre" will automatically be generated for all future books. So if you're adding additional books to your library and have Calibre configured to automatically assign tags, you'll have semantically meaningful genres auto applied without doing anything after you've set this up the first time.
|
03-10-2024, 11:25 AM | #8 |
Groupie
Posts: 165
Karma: 3719858
Join Date: May 2013
Location: Parsons, KS
Device: 12th Gen Colorsoft SE, 2024 11” M4 iPad Pro & iPhone 15 Pro Max
|
This is brilliant! I’ve been wanting to redo my tags because they’re a mess. Metadata in indie books will have like 40 different tags and to me that’s not useful. I’m definitely trying this! Thanks!
|
03-10-2024, 12:45 PM | #9 |
Grand Sorcerer
Posts: 11,484
Karma: 230178176
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
For me the manual method is the only one that works. I don't want to use any tags or genres but my own, so no automatic method would work for me. For this reason I do all other metadata manually also, not by downloading.
|
03-10-2024, 08:36 PM | #10 |
Bibliophagist
Posts: 40,501
Karma: 156983616
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
I tried that with a random selection of 100 ebooks copied to their own library. Too many of the tags were, to put it mildly, rather interesting. For instance, I never realized that, in Eric Flint's 1632 universe, Kerryn Offord & Rick Boatright's 1636: The Chronicles of Dr. Gribbleflotz was a fiction.romance.
|
03-23-2024, 11:41 AM | #11 |
Enthusiast
Posts: 40
Karma: 10
Join Date: May 2018
Device: Onyx Note Lite - Win10
|
@Firehose
This is really good! Thanks a lot for taking the time and making the effort to explain how to set this up step by step to help other users. It made it very easy for me to get it working. I started using it just moments ago, but at first glance it does indeed seem to work very well. Thanks again! |
03-23-2024, 12:04 PM | #12 |
Enthusiast
Posts: 40
Karma: 10
Join Date: May 2018
Device: Onyx Note Lite - Win10
|
@Firehose
Two questions for now:
1. Am I correct to think that books that have no 'genre' tag/information are getting labeled with 'nonfiction' by default? Can that be avoided? I think I'd prefer to have nothing showing when the genre is unknown.
2. Is it the case that books labeled with several genre tags only show up in one node? E.g., if a book has the genre tags 'Science fiction' and 'Humor', it will only show up under 'Science fiction', and not under 'Humor'? (I hope not, and that I am doing something wrong ;-)
Last edited by jist; 03-23-2024 at 02:43 PM. |
03-26-2024, 03:03 AM | #13 |
Cultivator
Posts: 94
Karma: 216
Join Date: Feb 2015
Device: PRST2
|
That's useful when you have a completely new library that already has many books in it.
The more you interact with your library, the more you want to do it manually. I just do it manually after viewing the novelupdates tags, which are very detailed and "useful". |
03-31-2024, 10:36 AM | #14 |
Junior Member
Posts: 1
Karma: 10
Join Date: Mar 2024
Device: Kobo Aura
|
Unfortunately, this doesn't work as expected at my end. All books get marked as Nonfiction, because there is no Fiction tag. That means that Child 44 by Tom Rob Smith is marked as Nonfiction.Thriller.
It seems that many books (at least the ones in my library) don't have the Fiction or Nonfiction tag/shelf added. I changed

Code:

    top_tag = str_in_list(tags, ',', 'Fiction', 'Fiction', 'Nonfiction');

to

Code:

    top_tag = list_item(tags, 0, ',')

and

Code:

    first_subtag = list_item(subtags, 0, ',');

to

Code:

    first_subtag = list_item(subtags, 1, ',');

But, sad to say, tags seem to be just arbitrary words, which means you will never get a really clean list. Thanks for your effort; I hope some others can use my changes to make it work for them. |
08-30-2024, 12:25 PM | #15 |
Junior Member
Posts: 1
Karma: 10
Join Date: Aug 2024
Device: PocketBook Era
|
This approach works great, thank you for writing it!
I'd like to add a couple of clarifications to make it work as you intended:

#### Adding Goodreads Tags

If "Filter and map genres to calibre tags" is selected, you have to manually add Fiction and Nonfiction as genre mappings (by clicking the "Add genre mapping" button); otherwise it will always filter them out and every book will be classified as Nonfiction.

### Genres

After you've added the Genre column, you need to go to Preferences > Tag browser > Hierarchy and searching and tick the checkbox next to the Genre category; otherwise it won't recognise it as a hierarchical genre structure. |
|