Old 01-06-2023, 12:56 PM   #1
DaltonST
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
[GUI Plugin] Extract People & Other Metadata

Summary: EPOM uses your personally-constructed 'Python Regular Expressions' to search the text of an ebook and extract metadata from it.

Documentation: The EPOM 'User Guide' consists of all of its ToolTips plus any images and related files attached below.

Requires Minimum Calibre Version: 6.11.0

Other Useful Calibre Plugins to Consider:
  • [GUI Plugin] CalibreSpy
  • [GUI Plugin] View Manager

Version History:

Version 1.0.0 - 2023-01-06 Initial release.
Attached Thumbnails:
  • epom_overview_page1.jpg (1.44 MB)
  • epom_overview_page2.jpg (1.21 MB)
Attached Files
File Type: zip extract_people_other_metadata.zip (48.2 KB, 32967 views)

Last edited by DaltonST; 01-07-2023 at 12:18 PM.
Old 01-06-2023, 12:57 PM   #2
DaltonST
Deviser
For Future Use
Old 05-23-2023, 09:02 AM   #3
arpeggioaccele
light mode user
Posts: 66
Karma: 16268
Join Date: May 2023
Location: New England
Device: I use the Calibre ebook-viewer on macos and Apple Books on ios.
Great plugin!

I installed Extract People & Other Metadata through calibre's user plugins and used it to extract AO3 links from the epubs I downloaded, though I'm not sure I did it the best way. (I have been intentionally avoiding FanFicFare.) I thought it would be good to share in case anyone else needs my janky solution.

I couldn't figure out how to remove a set string at the beginning and end of the text so here's the solution I made.

I made a custom column named link.
I set up three #link custom column extractors:
To identify work links I used the keyword "Posted originally on the.+$". I wanted to make sure I got the right link, since sometimes there are other work links in the file.
I filtered the result down to just "s/" plus the work ID number, using "s\/\d+".
Then in tweaks I added
Code:
REMOVE_CHARACTERS=.<>[]()
CALIBRE_TEMPLATE_LANGUAGE_BUILTIN=re(1, 'Posted originally on the Archive of Our Own at ', '')
CALIBRE_TEMPLATE_LANGUAGE_BUILTIN=re(1, 'Posted originally on the Archive of Our Ownhttp://archiveofourownorg/ at ', '')
CALIBRE_TEMPLATE_LANGUAGE_BUILTIN=re(1, 's/','http://archiveofourown.org/works/')
This removes the extraneous period and the original keyword, then adds back the website URL by replacing the "s/". When you don't have a full-text index, the raw HTML needs to be filtered differently.
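For anyone who wants to test the three steps outside calibre first, here is a rough standalone sketch in plain Python. It does not use EPOM's internals or its tweak syntax. The sample text and the function name extract_work_link are made up for illustration, and real epub text will vary.

```python
import re

# Hypothetical sample text; real epub text will vary.
SAMPLE = (
    "Some Title\n"
    "Posted originally on the Archive of Our Own at "
    "http://archiveofourown.org/works/12345.\n"
    "Rating: General Audiences\n"
)


def extract_work_link(text):
    """Return a rebuilt work URL, or None if no matching line is found."""
    # Step 1: keyword match, running to the end of that line.
    line = re.search(r"Posted originally on the.+$", text, re.MULTILINE)
    if line is None:
        return None
    # Step 2: keep only "s/" plus the numeric work ID.
    work = re.search(r"s\/\d+", line.group(0))
    if work is None:
        return None
    # Step 3: swap the "s/" prefix for the site URL.
    return re.sub(r"s/", "http://archiveofourown.org/works/", work.group(0), count=1)


print(extract_work_link(SAMPLE))
```

Because step 2 keeps only "s/" and digits, the trailing period never makes it into the result, so this sketch needs no separate character-removal step.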

To identify series links, it was easier to just assume there was only one and use the keyword "http:\/\/archiveofourown.org\/series\/.+$".
There was no trailing period to remove, but I filtered for the properly formatted URL just in case.
I only turned this extractor on for works I combined with epubmerge.
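A stricter pattern can do the "properly formatted URL" check in one go. The sketch below is plain Python, not EPOM configuration. The pattern, the sample sentence, and the name find_series_link are assumptions for illustration. Unlike the keyword above, it escapes the dot in the domain and requires a numeric series ID.

```python
import re

# Assumed stricter pattern: escaped dot, optional "s" in the scheme, digits-only ID.
SERIES_URL = re.compile(r"https?://archiveofourown\.org/series/\d+")


def find_series_link(text):
    """Return the first well-formed series URL in text, or None."""
    match = SERIES_URL.search(text)
    return match.group(0) if match else None


# Hypothetical sample line; real merged epubs will differ.
sample = "Part 1 of the series at http://archiveofourown.org/series/6789 and more text"
print(find_series_link(sample))
```

Anything after the digits is left out of the match, so stray trailing text on the same line does not end up in the column.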

For fanfiction.net works that I downloaded with ficlab, I used the keyword "based on content retrieved from.+$".
I filtered for "s/" plus the story ID, using "s\/\d+".
I replaced s/ with the website url after removing extraneous characters.
Code:
REMOVE_CHARACTERS=.
CALIBRE_TEMPLATE_LANGUAGE_BUILTIN=re(1, 'based on content retrieved from ', '')
CALIBRE_TEMPLATE_LANGUAGE_BUILTIN=re(1, 's/', 'https://www.fanfiction.net/s/')
And now I have a link in book details that I cannot click as a link. I probably should have made it an identifier or something lol, but it works for my purpose of exporting to a spreadsheet.
Old 05-23-2023, 05:57 PM   #4
arpeggioaccele
light mode user
I now realize this can be done through FanFicFare, or with mass search and replace to find URLs in books and grab metadata, and that it correctly makes the URL an identifier... oops.
https://www.mobileread.com/forums/sh...ntifiers+links
https://www.mobileread.com/forums/sh...nk#post4320727
https://www.mobileread.com/forums/sh...ntifiers+links
Old 09-14-2023, 11:39 PM   #5
Comfy.n
want to learn what I want
Posts: 1,284
Karma: 6433040
Join Date: Sep 2020
Device: Calibre E-book viewer
I wish I had tried this amazing plugin before using the similar tool in Job Spy!
Old 01-06-2024, 01:09 AM   #6
danmeian
Member
Posts: 11
Karma: 10
Join Date: Jul 2021
Device: Windows 11
This is an awesome tool. Is there a way to bulk extract info from multiple ebooks, instead of doing one at a time?
If it's necessary to do one at a time, is there a way to avoid the result screen popping up (where it shows the updated book, and you need to click back to the home screen, then do another search to get back to where you originally were)?
Thank you!
Old 03-11-2024, 03:26 PM   #7
Comfy.n
want to learn what I want
edit: it's ok now

Last edited by Comfy.n; 03-11-2024 at 04:31 PM.
Old 06-24-2024, 06:53 AM   #8
danmeian
Member
@Comfy.n: Awesome, tysm!! This has been super useful. A small issue is that when I do multiple books without using the FTS index, the results pop up one by one. Consequently, only the last extracted book gets marked. This makes it a bit inconvenient when going back and forth between different search pages. That aside, the function definitely works.

Question: Is there any way I can use this to extract the first 3-4 paragraphs from an epub?

Background info: I'm trying to generate a cover from the first page of an epub. Calibre's default "set cover from book" doesn't seem to work too well, so my plan is to
1. Use EPOM to extract the first 3-4 paragraphs from the epub (into custom column #first_pars)
2. Use Generate Cover to create a cover with text from #first_pars
Old 06-24-2024, 08:44 PM   #9
Comfy.n
want to learn what I want
Well, I've used EPOM just for extracting translators and original titles, using the FTS index. That was not too challenging. In your case it would be better if Dalton could help, but unfortunately he's been away from MR for almost a year. Or maybe some regex power user could.

I don't see an easy way to detect the exact beginning of the text, given the variations in ebook structure. However, you could try something like this:

- set the tweak MAXIMUM_LENGTH_TO_ACCEPT= to a large value
- then populate the #first_pars column using a regex that matches, say, the first 1000 chars in the book
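As a rough illustration of the "first 1000 chars" idea, here is a plain-Python sketch of such a regex. It is not tested inside EPOM. Whether EPOM lets "." match newlines, or accepts an inline (?s) flag, is not confirmed in this thread. The name first_chars and the 1000 limit are just placeholders.

```python
import re

# \A anchors at the very start of the text; DOTALL lets "." cross line breaks.
FIRST_CHARS = re.compile(r"\A.{1,1000}", re.DOTALL)


def first_chars(text):
    """Return up to the first 1000 characters of text, newlines included."""
    match = FIRST_CHARS.search(text)
    return match.group(0) if match else ""


# Hypothetical stand-in for a book's extracted text.
book_text = "First paragraph.\n\nSecond paragraph.\n\n" + "x" * 5000
print(len(first_chars(book_text)))
```

Without DOTALL the match would stop at the first line break, which in most books means only the first line rather than several paragraphs.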