#31 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
I made a simple fix and submitted it as Ticket #4620. With the swap checkbox selected, it splits the author's name at a comma if there is one; otherwise it splits at the first space instead of the last space, so that middle names and initials are handled correctly.
|
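For anyone curious what that splitting rule looks like in Python, here is a minimal sketch. It is not the code from the ticket; `swap_author_name` is a hypothetical helper, and it assumes the swap simply moves the first piece of the name to the end.

```python
def swap_author_name(name: str) -> str:
    """Swap the two halves of an author name (hypothetical helper).

    Split at the first comma if there is one; otherwise split at the
    first run of whitespace, so middle names and initials stay together.
    """
    if "," in name:
        parts = name.split(",", 1)
    else:
        parts = name.split(None, 1)
    # Drop surrounding whitespace and any empty piece (e.g. a trailing comma).
    parts = [p.strip() for p in parts if p.strip()]
    if len(parts) == 2:
        parts.reverse()
    return " ".join(parts)


comma_form = swap_author_name("Asimov, Isaac")
space_form = swap_author_name("Tolkien J. R. R.")
single_name = swap_author_name("Plato")
```

Splitting at the first space keeps "J. R. R." in one piece; splitting at the last space would have broken the initials apart.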
#32 |
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
|
Nice!
Hope it's accepted. m a r |
#33 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
According to bug tracker it will be in the next release. It's a trivial change, and, I suspect it takes Kovid longer to look at the proposed change and make sure I haven't buggered up his code than to do it himself. Nonetheless, it's nice to feel like I've made a tiny contribution. Plus, it does address a problem that was bothering me.
Now I'm off to read some more on Python. There's this other tiny issue that bugs me ... |
#34 |
Junior Member
Posts: 1
Karma: 10
Join Date: May 2010
Device: ipad
|
Calibre - ebook import Metadata - publisher, date
Hi there,
I've just started to use Calibre and so far, I think I like it. Many thanks to Kovid for the wonderful work. My PDF ebook/document collection is named in the format title - author - publisher - date - series.pdf. I couldn't find any help in the forum on how to import the publisher and date into the metadata. Can anyone help with this? Thanks. |
#35 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
publisher: ?P<publisher>
published date: ?P<pubdate>
entered date: ?P<timestamp>
I've never tried any of those, and they aren't in the regex test, so they may not work. If they don't, put in an enhancement request. |
|
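As a rough illustration of how named groups would pick apart a filename in the "title - author - publisher - date - series.pdf" layout, here is a sketch using plain Python `re`. It only shows the regex mechanics; whether calibre itself accepts the `publisher` and `pubdate` group names is untested, as noted above. The filename and `parse_filename` are made up for the example.

```python
import re

# One named group per field, separated by " - ". The lazy ".+?" stops each
# field at the next separator instead of swallowing the rest of the name.
FILENAME_PATTERN = re.compile(
    r"(?P<title>.+?) - (?P<author>.+?) - (?P<publisher>.+?) - "
    r"(?P<pubdate>.+?) - (?P<series>.+?)\.pdf$"
)


def parse_filename(filename):
    """Return a dict of the named fields, or None if the name doesn't fit."""
    match = FILENAME_PATTERN.match(filename)
    return match.groupdict() if match else None


# Hypothetical filename, used only to exercise the pattern.
fields = parse_filename(
    "Triplanetary - E. E. Smith - Fantasy Press - 1948 - Lensman.pdf"
)
```

A title or author that itself contains " - " would shift every later field, so filenames like that need fixing by hand first.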
#36 |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2010
Device: Nook
|
I need to come up with a regex to detect and remove page numbers from the bottom of PDF pages so I can convert them to EPUB for use on a Nook. The page numbers come through in bold, with a paragraph break after them. The HTML I'd like to remove is below (### stands for the page number):
<b>Page ###</b></p><p>
Thanks for the help. |
#37 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Code:
<b>Page /d+.*<p> |
|
#38 |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2010
Device: Nook
|
Code:
<b>Page /d+.*<p> Code:
<b>Page 1</b></p><p> |
#39 |
Connoisseur
Posts: 55
Karma: 10
Join Date: Jan 2010
Device: Nexus One
|
It's a bit messier than Starson17's solution, but if that's not working, try this:
Code:
<b>Page [0-9]{1,3}</b></p><p>
But Starson17's regex should have matched. Make sure all the tags surrounding the page number are correct. Maybe copy/paste an actual example, and then replace the middle of it with the regex, to be sure you didn't miss a space or something. |
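The copy/paste suggestion can be tried outside calibre in a few lines of Python. This is only a sketch of the idea: take one real marker, keep its literal text on either side of the number, and swap the number for `\d+`. The sample string and `pattern_from_sample` are assumptions for illustration.

```python
import re


def pattern_from_sample(sample: str, number: str) -> str:
    """Build a regex from a literal sample by replacing `number` with \\d+."""
    prefix, suffix = sample.split(number, 1)
    # re.escape keeps every copied character literal, spaces included.
    return re.escape(prefix) + r"\d+" + re.escape(suffix)


# Paste an actual marker from the converted book here.
pattern = pattern_from_sample("<b>Page 1</b></p><p>", "1")
matches_other_page = re.fullmatch(pattern, "<b>Page 123</b></p><p>") is not None
```

Because the surrounding text comes straight from the book, an unexpected extra space or tag is carried into the pattern rather than missed.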
#40 |
Connoisseur
Posts: 58
Karma: 12
Join Date: Jan 2009
Device: none
|
<b>Page [0-9]{1,4}</b></p><p>
Should remove it. Of course, this may not work from calibre. I use Sigil to remove those sorts of things.
Last edited by darkmonk; 05-24-2010 at 11:48 PM. Reason: edit: I see this is pointless, as I was beaten by two minutes. |
#41 |
creator of calibre
Posts: 44,812
Karma: 25490602
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
@vinco
you need a backslash not a forward slash in <b>Page /d+.*<p> |
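To see why the slash direction matters, here is a small check in plain Python `re` (the same regex flavour calibre uses, since calibre is written in Python). The sample line is shortened from the kind of HTML discussed in this thread and is only an illustration.

```python
import re

html = "only <b>Page 1</b></p><p> one, planetary"

# "/d+" means a literal slash followed by one or more letter d's.
wrong = re.compile(r"<b>Page /d+.*<p>")
# "\d+" means one or more digits.
right = re.compile(r"<b>Page \d+.*<p>")

wrong_match = wrong.search(html)
right_match = right.search(html)
```

One caveat about the `.*` part: it is greedy, so if several `<p>` tags sit on the same line it runs to the last one. Using `.*?` or spelling out the tags avoids that.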
#42 |
Junior Member
Posts: 7
Karma: 10
Join Date: May 2010
Device: Nook
|
Still confused.
Code:
<b>Page \d+</b></p><p>
I'm converting from a PDF, to EPUB. A sample of the XML generated from the PDF is below.
Code:
Since nothing material was destroyed when the Eddorians were forced into the next plane of existence, their historical records also have become available. Those records-folios and tapes and playable discs of platinum alloy, resistant indefinitely even to Eddore's noxious atmosphere agree with those of the Arisians upon this point. Immediately before the Coalescence began there was one, and only <b>Page 1</b></p><p> one, planetary solar system in the Second Galaxy; and, until the advent of Eddore, the Second Galaxy was entirely devoid of intelligent life. </p><p>
Last edited by vinco; 05-25-2010 at 01:59 AM. |
#43 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
#44 | |
Grand Sorcerer
Posts: 12,220
Karma: 7955067
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
In your example, there are two spaces between 'Page' and '1'. If that is an accurate copy, then you need to match more than 1 space there. Try
Code:
<b>Page +\d+</b></p><p>
Code:
<b>\s*Page +\d+\s*</b>\s*</p>\s*<p> |
|
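Here is a quick way to test the whitespace-tolerant expression outside calibre. It is a sketch in plain Python `re`; the sample is shortened from the text posted earlier, with the two spaces after "Page" written out, and `strip_page_markers` is a made-up helper name.

```python
import re

# Tolerates extra spaces after "Page" and stray whitespace around the tags.
PAGE_MARKER = re.compile(r"<b>\s*Page +\d+\s*</b>\s*</p>\s*<p>")


def strip_page_markers(html: str) -> str:
    """Delete every bold page-number marker plus the paragraph break after it."""
    return PAGE_MARKER.sub("", html)


sample = "there was one, and only <b>Page  1</b></p><p> one, planetary solar system"

# The single-space version does not match this sample at all.
strict_match = re.search(r"<b>Page \d+</b></p><p>", sample)
cleaned = strip_page_markers(sample)
```

Removing the `</p><p>` along with the marker also rejoins the sentence that the page break had split in two. A leftover double space may remain where the marker was.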
#45 |
Member
Posts: 19
Karma: 54
Join Date: Feb 2010
Location: San Francisco, CA
Device: Nook
|
Guys,
Name Changer is a great tool for fixing filenames in a directory before you import into Calibre. Works on a Mac. |
Tags |
regex, regular expressions |
|