06-12-2010, 11:37 AM   #61
TreborPugly
Member
Posts: 13
Karma: 954
Join Date: Jun 2010
Device: Mobipocket reader on Blackberry, XO using FBreader, Kindle
Quote:
Originally Posted by Starson17 View Post
Try this:
Code:
^((?P<author>([^\_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?((?P<series>[^0-9\-]+) ([-#] ?)?(?P<series_index>[0-9.]+)?\s*-\s*)?(?P<title>[^(]+)
It excludes the open paren from the title.

Thank you muchly. That does the trick.
06-12-2010, 12:43 PM   #62
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by TreborPugly View Post
Thank you muchly. That does the trick.
You're welcome. Studying the other parts of that regex is instructive, if you're interested - particularly the positive and negative look-ahead assertions (?= and ?!) and the or construct (|) you asked about.
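For anyone who wants to study those constructs in isolation, here is a minimal Python sketch (Calibre is written in Python and uses its regex syntax). The two patterns and the helper name `author_before_title` are illustrative assumptions; they are much simpler than the regex quoted above and are not taken from it.

```python
import re

# Positive look-ahead (?=...): the author must be followed by a dash,
# but the dash itself is not consumed.
# Negative look-ahead (?!...): reject the match if that dash is followed by
# a number, because the text is then more likely a series name.
AUTHOR = re.compile(r"^[^-]+?(?=\s*-\s*)(?!\s*-\s*[0-9.]+)")

# Alternation (|): accept either extension.
EXTENSION = re.compile(r"\.(?:epub|pdf)$")


def author_before_title(name):
    """Return the leading author part of `name`, or None if there is none."""
    match = AUTHOR.match(name)
    return match.group(0) if match else None
```

With these definitions, `author_before_title("Orson Scott Card - Seventh Son")` gives `"Orson Scott Card"`, while `"Alvin Maker - 01 - Seventh Son"` gives `None` because the negative look-ahead sees the number after the dash.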
06-22-2010, 03:02 AM   #63
Mythlandia
Reader
Posts: 13
Karma: 6184
Join Date: Jun 2009
Location: Celebration, FL
Device: iPad
In trying to experiment with regex for the file name, I wiped my default string.
I am now just trying to get back to "author - title", but on Calibre 0.7.4 for OS X I can't get any regex to pass the test or to work with real file names. I have tried lots of examples from this thread and cannot get a single one to parse anything.
Any ideas what I might be doing wrong?
06-22-2010, 03:52 AM   #64
DoctorOhh
US Navy, Retired
Posts: 9,878
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by Mythlandia View Post
I can't get any regex to pass the test or to work with real file names. I have tried lots of examples from this thread and cannot get a single one to parse anything.
I seem to recall someone saying that the test string needs to include the file extension (.epub or .pdf) for the test to work.
06-24-2010, 04:47 PM   #65
Mythlandia
Reader
Posts: 13
Karma: 6184
Join Date: Jun 2009
Location: Celebration, FL
Device: iPad
Thanks - that was the magic info I needed!

Phil
07-20-2010, 03:41 AM   #66
Barak
Junior Member
Posts: 2
Karma: 10
Join Date: Jul 2010
Device: iPod Touch
Hello All. I could use some help. I'm very new to Regular Expressions...

I have file names in one of three formats:

Orson Scott Card - Alvin Maker, Book #01 - Seventh Son.txt
Orson Scott Card - Alvin Maker 01 - Seventh Son.txt
Orson Scott Card - Seventh Son.txt

Is there a way to extract the relevant information from all three formats with one RegExp?

Thanks for any help!

Barak
07-20-2010, 05:06 AM   #67
capidamonte
Not who you think I am...
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
(?P<author>.*?)(\s\-\s)((?P<series>.*?)(\,\s)?(Book \#)?(?P<seriesindex>\d\d)(\s\-\s)(?P<title>.*?))|(?P<title>.*?)(\.txt)

Should pull it, but try it on some test cases first... it might fail on titles with numbers in them.
07-20-2010, 09:42 AM   #68
Barak
Junior Member
Posts: 2
Karma: 10
Join Date: Jul 2010
Device: iPod Touch
Thanks for the try... unfortunately Calibre complains that that is an invalid RegEx...

"ERROR: Invalid regular expression: redefinition of group name u'title' as group 10; was group 9"

EDIT: The following matches my first two formats, but bombs on the simple "Author Name - Book Title.ext" format.

(?P<author>[^_]+) - (?P<series>.+)( |(, Book #))(?P<series_index>[0-9]+) - (?P<title>.+)

2nd EDIT: Re-reading the entire thread, I found GinoAMelone's post and added to it the specific bit I need to sort out the ", Book #" part, and now it does everything I need it to!

^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?((?P<series>[^0-9\-]+)(\s*-\s*)?( |(, Book #))(?P<series_index>[0-9.]+)\s*-\s*)?(?P<title>[^\-_0-9]+)

Which matches the following:
Series Name 1 - Book Title.txt
Author - Series Name 1 - Book Title.txt
Author - Series Name, Book #1 - Book Title.txt
Author - Series Name - 1 - Book Title.txt
Author - Book Title.txt
Series Name - 1 - Book Title.txt

Last edited by Barak; 07-20-2010 at 01:31 PM. Reason: Followup
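
The final pattern above can also be checked outside Calibre with Python's `re` module. This is only a sketch: the helper `parse` is a hypothetical name, and stripping the extension and surrounding spaces before comparing is an assumption made here for readability, not something stated in the thread.

```python
import re

# The final pattern from the post above, copied unchanged.
PATTERN = re.compile(
    r"^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?"
    r"((?P<series>[^0-9\-]+)(\s*-\s*)?( |(, Book #))(?P<series_index>[0-9.]+)\s*-\s*)?"
    r"(?P<title>[^\-_0-9]+)"
)


def parse(filename):
    """Match the filename without its extension and return the named groups."""
    # Assumption: the extension is removed before matching.
    stem = filename.rpartition(".")[0]
    match = PATTERN.search(stem)
    if match is None:
        return None
    # Trim stray spaces; groups that did not take part in the match stay None.
    return {
        key: (value.strip() if value else value)
        for key, value in match.groupdict().items()
    }
```

For the three original formats, `parse` yields author "Orson Scott Card" and title "Seventh Son" each time; the first two also give series "Alvin Maker" with index "01", and the plain "Author - Title" form leaves both series fields as `None`.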
07-20-2010, 06:24 PM   #69
capidamonte
Not who you think I am...
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
Guess the "or" function doesn't exclude one of the two <title>s.

Glad you figured it out.
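
The "redefinition of group name" error can be reproduced in plain Python, which rejects a pattern that defines the same group name twice even when the two definitions sit in different branches of an alternation. The patterns below are simplified stand-ins written for this illustration, not the ones from the posts above, and `compiles` is a hypothetical helper.

```python
import re

# Simplified stand-in for the failing pattern: "title" is defined in both branches.
DUPLICATE = r"(?P<author>.+?) - (?P<title>.+)|(?P<title>.+)"


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# One way around it: make the author part optional instead of alternating,
# so each group name appears only once.
OPTIONAL_AUTHOR = re.compile(r"^(?:(?P<author>.+?) - )?(?P<title>.+)$")
```

Here `compiles(DUPLICATE)` is `False`, while `OPTIONAL_AUTHOR` handles both "Orson Scott Card - Seventh Son" and a bare "Seventh Son" with a single `title` group.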
09-17-2010, 12:29 PM   #70
pckopp
Enthusiast
Posts: 32
Karma: 44
Join Date: Jul 2010
Location: Seneca, SC
Device: Kindle, eReader
This regular expression:

[[](?P<series>.+)[]] (?P<title>.+) - (?P<author>[^_]+)

tests perfectly against this filename:

[Instrumentality Of Mankind] Golden the Ship Was Oh! Oh! Oh! - Cordwainer Smith.epub

title = Golden the Ship Was Oh! Oh! Oh!
author = Cordwainer Smith
series = Instrumentality Of Mankind

but when adding the file, it creates

title = [Instrumentality Of Mankind] Golden the Ship Was Oh! Oh! Oh!
author = Cordwainer Smith
series = Instrumentality Of Mankind

Appreciate any help. Thanks!
09-17-2010, 12:57 PM   #71
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by pckopp View Post
This regular expression:

[[](?P<series>.+)[]] (?P<title>.+) - (?P<author>[^_]+)

tests perfectly against this filename:

[Instrumentality Of Mankind] Golden the Ship Was Oh! Oh! Oh! - Cordwainer Smith.epub

title = Golden the Ship Was Oh! Oh! Oh!
author = Cordwainer Smith
series = Instrumentality Of Mankind

but when adding the file, it creates

title = [Instrumentality Of Mankind] Golden the Ship Was Oh! Oh! Oh!
author = Cordwainer Smith
series = Instrumentality Of Mankind

Appreciate any help. Thanks!
It works for me. Turn off Preferences|Import/Export|Adding|"Read metadata from file contents rather than file name"
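
That the pattern itself is sound can be confirmed outside Calibre. A minimal sketch, assuming the match is made against the file name without its extension; the brackets are written as `\[` and `\]` here, which match the same characters as the classes `[[]` and `[]]` in the quoted pattern.

```python
import re

# Same pattern as in the quoted post, with escaped brackets
# instead of the single-character classes [[] and []].
PATTERN = re.compile(r"\[(?P<series>.+)\] (?P<title>.+) - (?P<author>[^_]+)")

filename = "[Instrumentality Of Mankind] Golden the Ship Was Oh! Oh! Oh! - Cordwainer Smith.epub"

# Assumption: the extension is dropped before matching.
stem = filename.rpartition(".")[0]
fields = PATTERN.search(stem).groupdict()
```

`fields` then holds the series, title, and author exactly as in the test result quoted above, which points to the setting rather than the regex as the cause.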
09-17-2010, 01:03 PM   #72
pckopp
Enthusiast
Posts: 32
Karma: 44
Join Date: Jul 2010
Location: Seneca, SC
Device: Kindle, eReader
Quote:
Originally Posted by Starson17 View Post
It works for me. Turn off Preferences|Import/Export|Adding|"Read metadata from file contents rather than file name"
Thank you! Works for me, now.

So many buttons!
09-17-2010, 02:26 PM   #73
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by pckopp View Post
Thank you! Works for me, now.

So many buttons!
Yes!
FYI: Metadata during an Add Book operation comes from:
1) First Calibre looks for an .opf file with the same name as the format being added. It grabs that in preference to other settings.
2) Then it goes for internal or regex/filename metadata according to the setting Preferences|Import/Export|Adding|"Read metadata from file contents rather than file name"
3) Finally, if it fails in #2, it creates its own according to the filename and a default regex.
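
The order described in the three steps above can be sketched as a small decision function. This is not Calibre's code: the function name, its parameters, and the returned labels are all hypothetical and only restate the steps.

```python
def pick_metadata_source(has_opf, read_from_contents, step_two_succeeded):
    """Return a label for where metadata would come from, following steps 1-3."""
    # 1) A same-named .opf file is preferred over everything else.
    if has_opf:
        return "opf"
    # 2) Otherwise the preference decides between file contents and the filename regex.
    if step_two_succeeded:
        return "file contents" if read_from_contents else "filename regex"
    # 3) If step 2 yields nothing, fall back to the filename and a default regex.
    return "filename with default regex"
```

Under this reading, a user regex only takes effect when there is no .opf file and the "Read metadata from file contents rather than file name" option is off.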
01-10-2011, 08:49 AM   #74
flamelily
Junior Member
Posts: 2
Karma: 10
Join Date: Jan 2011
Device: none
Hi everyone! LOVE this software, absolutely brilliant!
My brain is flawed though, really struggling to find and work out some incredibly basic info.
I am trying to remove (xx) from the Title field, where xx is a number up to several hundred.
So I am just using a basic bulk-edit search and replace. My issue is wildcarding the numbers; I am missing some very basic bit of knowledge about how to do it.
I have tried:
(**), (??), ([0-9][0-9]), etc.
As you can see, I am just blindly flailing around, chucking anything in there!

Really sorry for the basic basic question, if anyone has two minutes to let me know I would hugely appreciate it. Call me any names you wish

Cheers
Lily
01-10-2011, 08:58 AM   #75
itimpi
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
An easier way to do this is to use '\d*' to match a number of any length. Another useful one along these lines is '\s*' to match any number of white space characters.
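
A short Python sketch of the point, using a made-up title. Two things are worth noting beyond the suggestion above: literal parentheses have to be escaped, since bare ( ) only create a group, and `\d+` (one or more digits) may be a safer choice than `\d*`, which also matches an empty "()".

```python
import re

title = "Seventh Son (12)"

# Unescaped parentheses only create a group, so this removes the digits
# but leaves the literal brackets behind.
unescaped = re.sub(r"([0-9][0-9])", "", title)

# Escaping the parentheses matches them literally; \d+ needs at least one digit,
# and the leading \s* also removes the space in front of the bracket.
escaped = re.sub(r"\s*\(\d+\)", "", title)
```

`unescaped` ends up as "Seventh Son ()" and `escaped` as "Seventh Son". In a bulk search and replace, the equivalent would presumably be `\s*\(\d+\)` as the search expression with an empty replacement.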
All times are GMT -4.