09-13-2024, 12:27 PM | #16 |
Grand Sorcerer
Posts: 12,171
Karma: 7908995
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
|
09-13-2024, 12:37 PM | #17 |
want to learn what I want
Posts: 1,416
Karma: 6874872
Join Date: Sep 2020
Device: none
|
11877
|
09-13-2024, 03:14 PM | #18 |
Grand Sorcerer
Posts: 12,171
Karma: 7908995
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
@Comfy_n: I am trying, and so far failing, to see why this takes so long. Having a large library to work with would help. Any chance you would be willing to share your metadata.db and the contents of the .cal_notes folder? The notes.db alone isn't enough -- I need the resources as well.
|
09-13-2024, 03:16 PM | #19 |
want to learn what I want
Posts: 1,416
Karma: 6874872
Join Date: Sep 2020
Device: none
|
sure I can
|
09-13-2024, 04:33 PM | #20 |
Grand Sorcerer
Posts: 12,171
Karma: 7908995
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Thanks.
I think I have found the bottleneck. Getting your metadata.db and the notes folder will let me be sure. NB: I don't need the books themselves.
Last edited by chaley; 09-13-2024 at 05:13 PM.
|
09-14-2024, 12:41 PM | #21 |
Grand Sorcerer
Posts: 12,171
Karma: 7908995
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Try this template. On my machine and using your library the search completes in 4 or 5 seconds.
Note that it is suitable to be used as a stored template with 2 arguments. I tested it as "books_with_notes_containing_text" with this as the actual template search.
Code:
python:
def evaluate(book, context):
    if context.arguments is None or len(context.arguments) != 2:
        # Set these to what you want
        field_name = 'authors'
        value_in_note = 'a'
    else:
        field_name = context.arguments[0]
        value_in_note = context.arguments[1]

    db = context.db.new_api

    # check if we have already cached the notes
    note_items = context.globals.get('items_with_notes', None)
    if note_items is not None:
        # We've already fetched which items have notes.
        # Get the cached search result values
        note_search_results = context.globals['note_search_results']
        item_name_map = context.globals['item_name_map']
    else:
        # First time. Get the items with notes and initialize
        # the search value cache.
        note_items = db.get_all_items_that_have_notes(field_name)
        context.globals['items_with_notes'] = note_items
        note_search_results = {}
        context.globals['note_search_results'] = note_search_results
        # db.get_item_id() uses a linear search. Avoid this by getting
        # and caching the map
        item_name_map = db.get_item_name_map(field_name)
        context.globals['item_name_map'] = item_name_map

    # Check if this book is a match -- that the field has a note containing
    # the desired text.
    # We must first get the item_id for the value in the field to be checked.
    field_value = book.get(field_name)
    if not field_value:
        return ''
    # if the field is multi-valued, use the first value
    if isinstance(field_value, list):
        if len(field_value) == 0:
            return ''
        field_value = field_value[0]
    # Now get the cached internal ID of the value in field_name
    item_id = item_name_map[field_value]
    # Does the item have a note? If not, give up now.
    if item_id not in note_items:
        return ''

    # The item has a note. Have we already checked it?
    if item_id not in note_search_results:
        # Item has a note but we haven't seen it before. Do the compare
        # on the plain text version of the note.
        result = ''
        # Get the note.
        note = db.notes_data_for(field_name, item_id)
        if note:
            # Get the plain text of the note.
            note = note['searchable_text'].partition('\n')[2]
            if note:
                # use a case insensitive compare to check if the search value is in the note
                from calibre.utils.icu import primary_contains
                result = 'Yes' if primary_contains(value_in_note, note) else ''
        # Cache the result of the comparison
        note_search_results[item_id] = result
        context.globals['note_search_results'] = note_search_results
    # Return the cached value
    return note_search_results.get(item_id, '')
|
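The speed-up in the template above appears to come from doing the expensive note lookup at most once per item and keeping the answer in a dict that outlives a single book's evaluation (context.globals in the template). Below is a minimal stand-alone sketch of that pattern. It does not use calibre at all: `FakeNotesDb`, `note_text` and `search_note` are hypothetical names invented for illustration, and plain `lower()` stands in for the ICU comparison.

```python
# Stand-alone sketch of the caching pattern; no calibre imports needed.
# FakeNotesDb and search_note are hypothetical stand-ins, not calibre APIs.


class FakeNotesDb:
    """Pretends to be a slow notes store and counts how often it is read."""

    def __init__(self, notes):
        self.notes = notes  # item_id -> note text
        self.reads = 0

    def note_text(self, item_id):
        self.reads += 1  # each call represents one expensive lookup
        return self.notes.get(item_id, '')


def search_note(db, shared, item_id, wanted):
    """Return 'Yes' if the note for item_id contains wanted, else ''.

    `shared` plays the role of context.globals: a dict that survives
    from one book's evaluation to the next within a single search.
    """
    cache = shared.setdefault('note_search_results', {})
    if item_id not in cache:
        text = db.note_text(item_id)
        # Simplified case-insensitive compare (assumption: ASCII-like text)
        cache[item_id] = 'Yes' if wanted.lower() in text.lower() else ''
    return cache[item_id]


db = FakeNotesDb({1: 'Won the Hugo award', 2: 'No prizes yet'})
shared = {}
# Three books by item 1 and one by item 2: only two reads are needed.
results = [search_note(db, shared, item_id, 'hugo') for item_id in (1, 1, 2, 1)]
print(results, db.reads)
```

In a library where many books share a handful of authors, the number of note reads tracks the number of distinct authors with notes rather than the number of books, which is consistent with the drop to a few seconds reported above.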
09-14-2024, 12:49 PM | #22 |
want to learn what I want
Posts: 1,416
Karma: 6874872
Join Date: Sep 2020
Device: none
|
yes that works
|
09-21-2024, 07:16 PM | #23 | |
Zealot
Posts: 108
Karma: 2029154
Join Date: Sep 2013
Location: Pacific Northwest
Device: iPad Mini, iPhone 12, Kindle Paperwhite 3
|
Quote:
One question, though: is there any way to modify that template to iterate over all the authors for a given title? That would be the absolute ideal solution for me. |
|
09-22-2024, 08:59 AM | #24 | |
Grand Sorcerer
Posts: 12,171
Karma: 7908995
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
Code:
python:
def evaluate(book, context):
    if context.arguments is None or len(context.arguments) != 2:
        # Set these to what you want
        field_name = 'authors'
        value_in_note = 'note'
    else:
        field_name = context.arguments[0]
        value_in_note = context.arguments[1]

    db = context.db.new_api

    # check if we have already cached the notes
    note_items = context.globals.get('items_with_notes', None)
    if note_items is not None:
        # We've already fetched which items have notes.
        # Get the cached search result values
        note_search_results = context.globals['note_search_results']
        item_name_map = context.globals['item_name_map']
    else:
        # First time. Get the items with notes and initialize
        # the search value cache.
        note_items = db.get_all_items_that_have_notes(field_name)
        context.globals['items_with_notes'] = note_items
        note_search_results = {}
        context.globals['note_search_results'] = note_search_results
        # db.get_item_id() uses a linear search. Avoid this by getting
        # and caching the map
        item_name_map = db.get_item_name_map(field_name)
        context.globals['item_name_map'] = item_name_map

    # Check if this book is a match -- that the field has a note containing
    # the desired text.
    # We must first get the item_id for the value in the field to be checked.
    field_values = book.get(field_name)
    if not field_values:
        return ''
    # We want to check every value in the item, so use a list.
    # If the given field is not multi-valued, turn it into a list
    if not isinstance(field_values, (tuple, list, set)):
        field_values = (field_values,)

    # Loop over the field values, checking each one. Stop on first success
    result = ''
    for field_value in field_values:
        # Get the cached internal ID of the value in field_name
        item_id = item_name_map[field_value]
        # Does the item have a note? If not, give up now.
        if item_id not in note_items:
            continue

        # The item has a note. Have we already checked it?
        result = note_search_results.get(item_id)
        if result is not None:
            # We've already checked this item.
            if result:
                # It matched. Break out of the loop
                break
            # It didn't match. Check the next item
            continue

        # Item has a note but we haven't seen it before. Do the compare
        # on the plain text version of the note.
        # The cache lookup above left result as None. Reset it so that an
        # empty note is cached and returned as '' rather than None.
        result = ''
        # Get the note.
        note = db.notes_data_for(field_name, item_id)
        if note:
            # Get the plain text of the note.
            note = note['searchable_text'].partition('\n')[2]
            if note:
                # use a case insensitive compare to check if the search value is in the note
                from calibre.utils.icu import primary_contains
                result = 'Yes' if primary_contains(value_in_note, note) else ''
        # Cache the result of the comparison
        note_search_results[item_id] = result
        if result:
            break
    # Cache the updated results
    context.globals['note_search_results'] = note_search_results
    return result
Last edited by chaley; 09-29-2024 at 06:39 AM. Reason: Minor correction to the template
|
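The change from the earlier template is that every value of the field is checked, not just the first, and the loop stops at the first value whose note matches. Here is a minimal stand-alone sketch of that control flow, runnable without calibre. All names (`plain_text`, `any_value_matches`, the sample authors and notes) are made up for illustration, `casefold()` is only a rough stand-in for calibre's `primary_contains`, and the sample data assumes the first line of the searchable text is something other than the note body, since the template discards it.

```python
# Stand-alone sketch; the names below are illustrative, not calibre APIs.


def plain_text(searchable_text):
    """Drop everything up to and including the first newline."""
    return searchable_text.partition('\n')[2]


def any_value_matches(values, notes_by_value, wanted, cache):
    """Return 'Yes' if any value has a note containing wanted, else ''."""
    # Treat a single-valued field like a one-element collection.
    if not isinstance(values, (tuple, list, set)):
        values = (values,)
    for value in values:
        raw = notes_by_value.get(value)
        if raw is None:
            continue  # this value has no note; try the next one
        if value not in cache:
            # casefold() is a rough stand-in for ICU primary_contains
            hit = wanted.casefold() in plain_text(raw).casefold()
            cache[value] = 'Yes' if hit else ''
        if cache[value]:
            return 'Yes'  # stop on the first matching value
    return ''


notes = {
    'Jane Doe': 'Jane Doe\nNebula finalist in 2019',
    'John Roe': 'John Roe\nWrites mostly short fiction',
}
cache = {}
first = any_value_matches(['John Roe', 'Jane Doe'], notes, 'nebula', cache)
second = any_value_matches('John Roe', notes, 'nebula', cache)
third = any_value_matches(['Nobody'], notes, 'nebula', cache)
print(repr(first), repr(second), repr(third))
```

A book with several authors therefore matches as soon as one of them has a matching note, and a non-matching author is compared only once no matter how many books list that author.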
|
09-28-2024, 10:24 PM | #25 |
Zealot
Posts: 108
Karma: 2029154
Join Date: Sep 2013
Location: Pacific Northwest
Device: iPad Mini, iPhone 12, Kindle Paperwhite 3
|
|
|