06-12-2010, 11:37 AM | #61 |
Member
Posts: 13
Karma: 954
Join Date: Jun 2010
Device: Mobipocket reader on Blackberry, XO using FBreader, Kindle
|
|
06-12-2010, 12:43 PM | #62 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
06-22-2010, 03:02 AM | #63 |
Reader
Posts: 13
Karma: 6184
Join Date: Jun 2009
Location: Celebration, FL
Device: iPad
|
In trying to experiment with regex for the file name, I wiped my default string.
I am now just trying to get back to "author - title", but on Calibre 0.7.4 for OS X I can't get any regex to pass the test or to work with real file names. I have tried lots of examples from this thread and cannot get a single one to parse anything. Any ideas what I might be doing wrong? |
06-22-2010, 03:52 AM | #64 |
US Navy, Retired
Posts: 9,878
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
|
I seem to recall hearing someone say they needed the file extension in the test string, .epub or .pdf to get the test to work.
|
06-24-2010, 04:47 PM | #65 |
Reader
Posts: 13
Karma: 6184
Join Date: Jun 2009
Location: Celebration, FL
Device: iPad
|
Thanks - that was the magic info I needed!
Phil |
07-20-2010, 03:41 AM | #66 |
Junior Member
Posts: 2
Karma: 10
Join Date: Jul 2010
Device: iPod Touch
|
Hello All. I could use some help. I'm very new to Regular Expressions...
I have file names in one of three formats:
Orson Scott Card - Alvin Maker, Book #01 - Seventh Son.txt
Orson Scott Card - Alvin Maker 01 - Seventh Son.txt
Orson Scott Card - Seventh Son.txt
Is there a way to extract the relevant information from all three formats with one RegExp?
Thanks for any help!
Barak |
07-20-2010, 05:06 AM | #67 |
Not who you think I am...
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
|
(?P<author>.*?)(\s\-\s)((?P<series>.*?)(\,\s)?(Book \#)?(?P<seriesindex>\d\d)(\s\-\s)(?P<title>.*?))|(?P<title>.*?)(\.txt)
Should pull it, but try it on some test cases first... it might fail on titles with numbers in them. |
07-20-2010, 09:42 AM | #68 |
Junior Member
Posts: 2
Karma: 10
Join Date: Jul 2010
Device: iPod Touch
|
Thanks for the try... unfortunately Calibre complains that that is an invalid RegEx...
"ERROR: Invalid regular expression: redefinition of group name u'title' as group 10; was group 9"
EDIT: The following matches my first two formats, but bombs on the simple "Author Name - Book Title.ext" format.
(?P<author>[^_]+) - (?P<series>.+)( |(, Book #))(?P<series_index>[0-9]+) - (?P<title>.+)
2nd EDIT: Re-reading the entire thread, I found GinoAMelone's post and added to it the specific bit I need to sort out the ", Book #" part, and now it does everything I need it to!
^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?((?P<series>[^0-9\-]+)(\s*-\s*)?( |(, Book #))(?P<series_index>[0-9.]+)\s*-\s*)?(?P<title>[^\-_0-9]+)
Which matches the following:
Series Name 1 - Book Title.txt
Author - Series Name 1 - Book Title.txt
Author - Series Name, Book #1 - Book Title.txt
Author - Series Name - 1 - Book Title.txt
Author - Book Title.txt
Series Name - 1 - Book Title.txt
Last edited by Barak; 07-20-2010 at 01:31 PM. Reason: Followup |
07-20-2010, 06:24 PM | #69 |
Not who you think I am...
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
|
Guess the "or" function doesn't exclude one of the two <title>s.
Glad you figured it out. |
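For reference, the `(?P<name>...)` groups used in this thread are Python-style, and Python's `re` module refuses to compile a pattern that defines the same group name twice, even when the two definitions sit on opposite sides of `|`. The sketch below reproduces that error outside Calibre with the pattern from post #67, then shows one possible single-pattern alternative that makes the series part optional instead of using alternation. `PATTERN` and `parse` are illustrations written for the three file names in post #66, not something taken from this thread; the group name `series_index` is the one used in post #68.

```python
import re

# The pattern from post #67: the name "title" appears on both sides of "|".
DUPLICATE = (
    r"(?P<author>.*?)(\s\-\s)((?P<series>.*?)(\,\s)?(Book \#)?"
    r"(?P<seriesindex>\d\d)(\s\-\s)(?P<title>.*?))|(?P<title>.*?)(\.txt)"
)

try:
    re.compile(DUPLICATE)
    compile_error = None
except re.error as exc:
    # Python reports the duplicate name instead of compiling the pattern.
    compile_error = str(exc)

print(compile_error)

# Illustrative alternative (an assumption, not from the thread):
# the "series + index" part is one optional non-capturing group,
# so each group name is defined only once.
PATTERN = re.compile(
    r"^(?P<author>.+?) - "
    r"(?:(?P<series>.+?)(?:, Book #| )(?P<series_index>\d+) - )?"
    r"(?P<title>.+)\.txt$"
)

FILENAMES = [
    "Orson Scott Card - Alvin Maker, Book #01 - Seventh Son.txt",
    "Orson Scott Card - Alvin Maker 01 - Seventh Son.txt",
    "Orson Scott Card - Seventh Son.txt",
]


def parse(name):
    """Return the named groups for one file name, or None if it does not match."""
    match = PATTERN.match(name)
    return match.groupdict() if match else None


for name in FILENAMES:
    print(parse(name))
```

This has only been checked against the three file names above; as noted in post #67, titles or series names that themselves contain numbers may still be split in the wrong place.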
09-17-2010, 12:29 PM | #70 |
Enthusiast
Posts: 32
Karma: 44
Join Date: Jul 2010
Location: Seneca, SC
Device: Kindle, eReader
|
This regular expression:
[[](?P<series>.+)[]] (?P<title>.+) - (?P<author>[^_]+)
tests perfectly against this filename:
[Instrumentality Of Mankind] Golden the Ship Was Oh! Oh! Oh! - Cordwainer Smith.epub
title = Golden the Ship Was Oh! Oh! Oh!
author = Cordwainer Smith
series = Instrumentality Of Mankind
but when adding the file, it creates
title = [Instrumentality Of Mankind] Golden the Ship Was Oh! Oh! Oh!
author = Cordwainer Smith
series = Instrumentality Of Mankind
Appreciate any help. Thanks! |
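A quick way to check a pattern like this outside Calibre is to run it through Python's `re` module. The sketch below uses the same pattern, with the brackets written as `\[` and `\]` (equivalent to the `[[]` and `[]]` character classes in the post). The helper name `parse_filename` is hypothetical, and stripping the extension before matching is an assumption made here so that the result lines up with the test output quoted in the post; it is not a statement about what Calibre does internally.

```python
import os
import re

# Same pattern as in the post, with the literal brackets escaped.
PATTERN = re.compile(r"\[(?P<series>.+)\] (?P<title>.+) - (?P<author>[^_]+)")


def parse_filename(filename):
    """Match the pattern against the file name without its extension."""
    # Assumption: drop ".epub" first so the author group does not swallow it.
    stem, _ext = os.path.splitext(filename)
    match = PATTERN.match(stem)
    return match.groupdict() if match else None


result = parse_filename(
    "[Instrumentality Of Mankind] Golden the Ship Was Oh! Oh! Oh! - Cordwainer Smith.epub"
)
print(result)
```

If the groups come out as expected here, the pattern itself is probably fine, and the difference seen when adding the file is more likely to come from where the metadata is being read.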
09-17-2010, 12:57 PM | #71 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
|
|
09-17-2010, 01:03 PM | #72 |
Enthusiast
Posts: 32
Karma: 44
Join Date: Jul 2010
Location: Seneca, SC
Device: Kindle, eReader
|
|
09-17-2010, 02:26 PM | #73 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Yes!
FYI: Metadata during an Add Book operation comes from:
1) First Calibre looks for an .opf file with the same name as the format being added. It grabs that in preference to other settings.
2) Then it goes for internal or regex/filename metadata according to the setting Preferences|Import/Export|Adding|"Read metadata from file contents rather than file name"
3) Finally, if it fails in #2, it creates its own according to the filename and a default regex. |
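The order described above can be sketched as a small decision function. This is only an illustration of the three steps as stated in the post: the function name, its parameters, and the returned labels are all hypothetical and do not correspond to anything in Calibre's code.

```python
def pick_metadata_source(has_matching_opf, prefer_file_contents, preferred_source_worked):
    """Return a label for where metadata would come from, following the three steps above.

    All names here are hypothetical; this mirrors the forum description only.
    """
    # Step 1: an .opf file with the same name wins over every other setting.
    if has_matching_opf:
        return "opf"
    # Step 2: otherwise follow the "Read metadata from file contents
    # rather than file name" preference, if that source yields anything.
    if preferred_source_worked:
        return "file contents" if prefer_file_contents else "filename regex"
    # Step 3: fall back to the file name parsed with a default regex.
    return "filename with default regex"


print(pick_metadata_source(False, True, True))
```

Read this way, a custom filename regex is only consulted when there is no matching .opf file and the preference is set to use the file name.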
01-10-2011, 08:49 AM | #74 |
Junior Member
Posts: 2
Karma: 10
Join Date: Jan 2011
Device: none
|
Hi everyone! LOVE this software, absolutely brilliant!
My brain is flawed though, really struggling to find and work out some incredibly basic info.
I am trying to remove (xx) from the Title field, where xx is various numbers up to several hundred, so I am just using a basic bulk edit search and replace.
My issue is trying to wildcard the numbers - I am missing some very basic bit of knowledge as to how to do it. I have tried: (**), (??), ([0-9][0-9]), etc., etc.! As you can see I am just blindly flailing around chucking anything in there!
Really sorry for the basic basic question, if anyone has two minutes to let me know I would hugely appreciate it. Call me any names you wish.
Cheers
Lily |
01-10-2011, 08:58 AM | #75 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
An easier way to do this is to use '\d*' to match a number of any length. Another useful one along these lines is '\s*' to match any number of white space characters.
|
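Putting that together for the "(xx)" case: in a regular expression the parentheses themselves have to be escaped as `\(` and `\)` to be matched literally, which is likely why `([0-9][0-9])` did not behave as hoped. A minimal sketch in Python's `re` syntax follows; the function name and sample titles are made up for illustration, and `\d+` is used rather than `\d*` so that an empty pair of brackets is left alone.

```python
import re

# \( and \) match literal parentheses; \d+ matches one or more digits.
# The leading \s* also removes any space in front of the bracketed number.
NUMBER_IN_PARENS = re.compile(r"\s*\(\d+\)")


def strip_number(title):
    """Remove a bracketed number such as " (12)" from a title."""
    return NUMBER_IN_PARENS.sub("", title)


titles = ["Seventh Son (12)", "Golden the Ship Was (345)", "No Number Here"]
cleaned = [strip_number(t) for t in titles]
print(cleaned)
```

In the bulk edit dialog the equivalent would be a search for `\s*\(\d+\)` with an empty replacement, assuming the search is set to regular-expression mode.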