Old 08-23-2014, 08:24 PM   #1
magmanpi
Enthusiast
Posts: 30
Karma: 1000
Join Date: Nov 2012
Device: none
Need regex help, please

I have an ePub file in which single smart quotes are used to open and close every quotation. I would like to use Sigil to change all the single smart quotes to double smart quotes.

Changing the open quotes is a simple normal search and replace, but a problem arises in changing the end quotes. I'm sure there must be a regex that would fix things, but the problem is that every apostrophe would seemingly be affected too.

For example, in code view: ‘I need help,’ O’Malley answered.</span></p>

Any expression that would find the single quote marks throughout the file would also find the apostrophe in all the words like O’Malley. And, of course, checking each find/replace to see if it's a quote or an apostrophe would take forever.

I wrote this regex: [^A-Za-z]’[^A-Za-z] but when I do a search, the expression finds the smart single end quote and the match also includes the end-of-sentence punctuation mark before it and the < of the closing span tag after it, thus: .’<

So when I use the smart double end quote in the replacement field, the end quote is correctly replaced but the period (or any other punctuation) and the open tag < are deleted.
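The neighbours disappear because each [^A-Za-z] consumes one character as part of the match. A possible way around that, sketched here in Python rather than in Sigil itself, is to use lookarounds, which test the surrounding characters without including them in the match. The sample string and the name `fix_closing_quotes` are made up for illustration; Sigil's engine is PCRE-based as far as I know and should accept the same lookaround syntax, but treat that as an assumption to verify.

```python
import re

# Hypothetical sample modelled on the line quoted in the post.
sample = "‘I need help.’</span> O’Malley answered."

# (?<![A-Za-z]) : the character before ’ is not an ASCII letter
# (?![A-Za-z])  : the character after ’ is not an ASCII letter
# Both checks are zero-width, so the match is the quote alone.
closing_quote = re.compile(r"(?<![A-Za-z])’(?![A-Za-z])")


def fix_closing_quotes(text):
    """Replace ’ with ” unless it touches a letter on either side."""
    return closing_quote.sub("”", text)


print(fix_closing_quotes(sample))
```

The period and the tag survive because they were never part of the match. The apostrophe in O’Malley is skipped, but plural possessives and words like ’tis would still be converted wrongly.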

I'm barely literate in regex, so I hope this makes sense. Any help in writing a regex that accomplishes what I'd like to do would be greatly appreciated. Thanks!
Old 08-23-2014, 09:18 PM   #2
theducks
Well trained by Cats
Posts: 30,110
Karma: 57259780
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Anything the search matches is normally deleted, so whatever you want to keep simply needs to be put back as part of the replace term.
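A minimal sketch of that idea in Python, assuming the pattern from the first post with a capture group wrapped around each neighbouring character. The function name `put_back_neighbours` and the sample text are illustrative only, and Python writes group references as \1 and \2 in the replacement.

```python
import re

# The pattern from the first post, with each neighbour captured
# so the replacement can restore it.
pattern = re.compile(r"([^A-Za-z])’([^A-Za-z])")


def put_back_neighbours(text):
    """Swap ’ for ” and re-insert the characters matched on either side."""
    return pattern.sub(r"\1”\2", text)


sample = "‘I need help,’ O’Malley answered.</span></p>"
print(put_back_neighbours(sample))
```

Here the comma and the space around the closing quote are matched, then written back by \1 and \2. The opening quote is left alone on purpose, since that one is a plain search and replace.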
Old 08-23-2014, 09:23 PM   #3
BetterRed
null operator (he/him)
Posts: 20,770
Karma: 27405072
Join Date: Mar 2012
Location: Sydney Australia
Device: none
I would have assumed Sigil would have had a button/tool to do that, but it appears not.

FWIW 1 - The calibre book editor has a button (Smart Punctuation) to do it. And DiapDealer is developing an Even Smarter Punctuation plugin for the calibre book editor. You do not have to use the calibre library manager to use the book editor; it can be used standalone.

FWIW 2 - I use both the Sigil and the calibre editors.

BR
Old 08-23-2014, 09:58 PM   #4
DiapDealer
Grand Sorcerer
Posts: 27,696
Karma: 196509000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
But if they're already single smart quotes, neither calibre's smartening routines nor my editor plugin will help the OP convert single smart quotes to double smart quotes. Neither will regex, quite frankly; not as a one-size-fits-all Replace All solution, anyway. It would take multiple passes and stepping through matches one by one to make sure everything went right. An algorithm of some kind would be better suited to that kind of wholesale conversion (and even that would probably never be 100% reliable).

If I was forced to do this using regex only, I'd try to change all apostrophes to some weird string with something like (\pL)’(\pL) replaced with \1~apos~\2 (I'd still need to look for plural possessive apostrophes, and words like ’tis and such). Then once I was satisfied that I'd protected all apostrophes by mangling them into a unique string, it should be relatively simple to replace the opening and closing single smart-quotes with their double smart-quote counterparts. With that done, I could go back and unmangle my apostrophes: replacing ~apos~ with ’ (or an entity).
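The three passes described above could look roughly like this in Python. It is a sketch under assumptions, not a tested Sigil recipe: Python's `re` module has no \pL, so `[^\W\d_]` is used as an approximation of "any letter", and the function name `convert_quotes` is invented for the example.

```python
import re

PLACEHOLDER = "~apos~"  # the "weird string" from the post
LETTER = r"[^\W\d_]"    # rough stand-in for \pL in Python's re module


def convert_quotes(text):
    # Pass 1: protect apostrophes that sit between two letters.
    text = re.sub(f"({LETTER})’({LETTER})", rf"\1{PLACEHOLDER}\2", text)
    # Pass 2: whatever single smart quotes remain are treated as dialogue quotes.
    text = text.replace("‘", "“").replace("’", "”")
    # Pass 3: unmangle the protected apostrophes.
    return text.replace(PLACEHOLDER, "’")


print(convert_quotes("‘I need help,’ O’Malley answered."))
```

As the post warns, anything pass 1 does not protect (plural possessives, ’tis and the like) comes out of pass 2 as a double quote, so those still need checking by hand before pass 2 is run.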

Mostly though, I probably wouldn't bother.
Old 08-23-2014, 10:33 PM   #5
BetterRed
null operator (he/him)
Posts: 20,770
Karma: 27405072
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by DiapDealer View Post
But if they're already single smart quotes, neither calibre's smartening routines nor my editor plugin will help...
I overlooked that five letter word

BR
Old 08-23-2014, 10:41 PM   #6
DiapDealer
Grand Sorcerer
Posts: 27,696
Karma: 196509000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by BetterRed View Post
I overlooked that five letter word

BR
Happens to the best of us.
Old 08-24-2014, 08:52 AM   #7
magmanpi
Enthusiast
Posts: 30
Karma: 1000
Join Date: Nov 2012
Device: none
Thanks for all the help, everyone. I think I'm leaning toward the path of least resistance, the "don't bother" solution, though I'm tempted to try DiapDealer's suggestion of protecting all the apostrophes by changing them to some weird string. Hmm, another way to do it might be normal searches that replace close quotes with all punctuation permutations:

.' with ."
!' with !"
,' with ,"

and so on.
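The same series of plain replacements can be written as a loop. This is only a sketch: it assumes the straight quotes in the list above stand for the single and double smart close quotes, and the marks beyond the three listed (? ; :) are my guess at what "and so on" covers. `close_after_punctuation` is a made-up name.

```python
# Marks assumed to sit directly before a closing quote.
PUNCTUATION = [".", "!", ",", "?", ";", ":"]


def close_after_punctuation(text):
    """Run one literal search and replace per punctuation mark."""
    for mark in PUNCTUATION:
        text = text.replace(mark + "’", mark + "”")
    return text


print(close_after_punctuation("‘I need help,’ O’Malley answered."))
```

Because every search term includes a punctuation mark, an apostrophe inside a word such as O’Malley is never touched.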
Old 08-24-2014, 09:26 AM   #8
DiapDealer
Grand Sorcerer
Posts: 27,696
Karma: 196509000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Definitely more than one way to tackle it. But I would think that eliminating ’ used as an apostrophe might be less tedious if done first. Once every ’ preceded and followed by a letter was eliminated, I would think a search for single closing smart quotes that weren't followed by punctuation would cover the bulk of the special-case plural possessives, and ’Tis, and argot-like ’em and ’im (for them and him). There's always going to be the possibility that <span> tags might interfere with the detection of ’ followed by punctuation, but that's always going to be the case no matter how you tackle it. I would think, though, that getting a document into a state where the ‘ and ’ represented only opening and closing dialog quotations wouldn't be an impossibly daunting task, if I was motivated enough to want to do it.
Old 08-24-2014, 10:34 AM   #9
Jellby
frumious Bandersnatch
Posts: 7,522
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I have been known to go through all the quote marks and apostrophes in a 1000-page book, one by one, replacing them with the appropriate curly variant and distinguishing between right single quote and apostrophe. With some preparatory regexp and a couple of single-key macros, it's not too hard.
Old 08-24-2014, 02:42 PM   #10
PeterT
Grand Sorcerer
Posts: 12,420
Karma: 74317824
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
Quote:
Originally Posted by Jellby View Post
I have been known to go through all the quote marks and apostrophes in a 1000-page book, one by one, replacing them with the appropriate curly variant and distinguishing between right single quote and apostrophe. With some preparatory regexp and a couple of single-key macros, it's not too hard.
I wonder if one could safely say that a quotation mark followed by a letter was an opening one, and one directly following a letter or punctuation mark was a closing one?
Old 08-24-2014, 02:56 PM   #11
Jellby
frumious Bandersnatch
Posts: 7,522
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by PeterT View Post
I wonder if one could safely say that a quotation mark followed by a letter was an opening one, and one directly following a letter or punctuation mark was a closing one?
If it's a single quote followed by a letter, not at all: it may be an apostrophe. In other cases, it depends on how much you trust the source; I have often found quotes on the wrong side of a space...
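To see where the proposed rule breaks, here is a small Python sketch that applies it literally to single smart quotes. The function name `classify`, the order in which the two tests are tried, and the list of punctuation marks are all my assumptions about how the rule would be implemented.

```python
PUNCTUATION = set(".,!?;:")


def classify(text):
    """Label each ‘ or ’ using only the characters next to it."""
    labels = []
    for i, ch in enumerate(text):
        if ch not in "‘’":
            continue
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if after.isalpha():
            labels.append("opening")    # followed by a letter
        elif before.isalpha() or before in PUNCTUATION:
            labels.append("closing")    # follows a letter or punctuation
        else:
            labels.append("unknown")
    return labels


print(classify("‘I need help,’ she said."))  # a well-behaved sentence
print(classify("O’Malley"))                  # an apostrophe
```

The dialogue quotes come out as opening and closing, but the apostrophe in O’Malley is labelled "opening" because a letter follows it, which is exactly the objection raised in the reply.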
Old 08-25-2014, 01:20 PM   #12
magmanpi
Enthusiast
Posts: 30
Karma: 1000
Join Date: Nov 2012
Device: none
Well, I made the smart quote fixes using normal searches that replaced close quotes with all punctuation permutations:

.' with ."
!' with !"
,' with ,"

and so on.

I also searched for punctuation like ellipses and hyphens and em dashes that were followed by a single smart close quote (with and without preceding and/or following spaces), replacing them with a double smart end quote. It really didn't take very long to run all the different searches and I'm quite pleased with the result.
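The ellipsis, hyphen and dash searches could be folded into a single pattern, sketched below in Python. The exact set of marks, the choice to allow at most one ordinary space before the quote, and the name `close_after_dashes` are assumptions on my part; the post does not say precisely which combinations were searched.

```python
import re

# Group 1: an ellipsis character, hyphen, en dash or em dash.
# Group 2: an optional single space between that mark and the quote.
dash_then_quote = re.compile(r"([…\-–—])( ?)’")


def close_after_dashes(text):
    """Turn ’ into ” when it follows an ellipsis or dash, keeping any space."""
    return dash_then_quote.sub(r"\1\2”", text)


print(close_after_dashes("‘Wait…’ ‘But— ’"))
```

Stepping through these matches one at a time is probably safer than Replace All, since a hyphen can also sit right before an apostrophe.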

My thanks to all who offered their help!
Old 08-29-2014, 04:16 AM   #13
capidamonte
Not who you think I am...
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
Search:
Code:
(\s)‘(.*?[^a-z\s])’(\s)
Replace:
Code:
$1&ldquo;$2&rdquo;$3
Try that. It's a one-pass search. The key is the set of non-letter characters (? ! . - ...) that appear at the end in a normal sentence that has quotes. Now with O'Malley having a quote between two capitalized letters, you may need to change the search slightly --

Search:
Code:
(\s)‘(.*?[^a-zA-Z\s])’(\s)
I used named entities in the replace, but you can change those as you like. I can't remember if Sigil needs to use $1 or \1 for replacement.

Test carefully.

Aloha.
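For anyone who wants to try the one-pass search outside Sigil first, here is a rough Python equivalent. Python's `re.sub` uses \1 rather than $1 in the replacement; the sample sentences and the name `convert` are made up for the test.

```python
import re

# The second search from the post, unchanged.
one_pass = re.compile(r"(\s)‘(.*?[^a-zA-Z\s])’(\s)")


def convert(text):
    """Apply the one-pass search with named entities in the replacement."""
    return one_pass.sub(r"\1&ldquo;\2&rdquo;\3", text)


print(convert("He said ‘I need help,’ and O’Malley nodded."))
print(convert("‘Help!’ he said."))
```

The first sentence is converted and O’Malley is left alone. The second comes back unchanged, because the pattern requires whitespace before the opening quote, so quotations that start right after a tag or at the very start of the text are missed.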
Old 08-29-2014, 04:34 AM   #14
Jellby
frumious Bandersnatch
Posts: 7,522
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
In any case you cannot have a single infallible regex. You need to understand the words in order to know which "’" is the right closing quote in things like this:

They played ‘Stompin’ at the Savoy’ at Vinicius’ yesterday.
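A quick Python check of that sentence against two simple pairing patterns (my own illustrative patterns, not ones proposed in the thread) shows the problem. Neither the lazy nor the greedy version closes the quotation after "Savoy":

```python
import re

sentence = "They played ‘Stompin’ at the Savoy’ at Vinicius’ yesterday."

# Lazy: pairs the opening quote with the first ’ it meets.
lazy = re.sub(r"‘(.*?)’", r"“\1”", sentence)

# Greedy: pairs the opening quote with the last ’ on the line.
greedy = re.sub(r"‘(.*)’", r"“\1”", sentence)

print(lazy)
print(greedy)
```

The lazy version closes the quote after "Stompin" and the greedy one after "Vinicius". Picking the ’ after "Savoy" requires knowing that "Stompin’" is a clipped word and "Vinicius’" a possessive.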