Old Today, 09:45 AM   #1
KarlG
Connoisseur
Possible RegEx error in Sigil (minimal match being ignored)

Hello,

I'm trying to delete all 'span' tags with no content (i.e. <span blah blah>(nothing here)</span>) by using the following RegEx (with minimal match enabled):

<span .*></span>

Here is some example text which shouldn't match anything:

<hgroup><h2 class="CHAPTER" id="ch1"><span class="CN"><samp class="SANS_Futura_Std_Bold_Condensed_B_11">1</samp></span> <span class="CT"><samp class="SANS_Dogma_OT_Bold_B_11">WINDOWS FOUNDATIONAL CONCEPTS</samp></span></h2></hgroup>

As we can see, there are no span tags without content. Running this RegEx, however, matches the following text (as can be seen in the attached screenshot):
<span class="CN"><samp class="SANS_Futura_Std_Bold_Condensed_B_11">1</samp></span>

Is there anything I have missed, or is this indeed an error?

Rgds

Karl
Attached image: regex error.png
Old Today, 11:29 AM   #2
Doitsu
Grand Sorcerer
The .* in your regular expression also matches other tags. The following expression should work:

Code:
<span[^>]*></span>
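A quick way to sanity-check this expression outside Sigil is a small Python sketch. Python's `re` module stands in for Sigil's regex engine here, which is an assumption (the two agree on this simple pattern, but are not the same engine). `WITH_EMPTY` and `strip_empty_spans` are made-up names for illustration; `SAMPLE` is the heading from the first post.

```python
import re

# Doitsu's pattern: [^>]* can never run past the ">" that closes the opening tag.
EMPTY_SPAN = re.compile(r'<span[^>]*></span>')

# The heading from the first post (no empty spans in it).
SAMPLE = (
    '<hgroup><h2 class="CHAPTER" id="ch1"><span class="CN">'
    '<samp class="SANS_Futura_Std_Bold_Condensed_B_11">1</samp></span> '
    '<span class="CT"><samp class="SANS_Dogma_OT_Bold_B_11">'
    'WINDOWS FOUNDATIONAL CONCEPTS</samp></span></h2></hgroup>'
)

# Hypothetical text that does contain an empty span.
WITH_EMPTY = '<p>Some <span class="x"></span>text</p>'


def strip_empty_spans(text):
    """Delete every <span ...></span> pair that has nothing between the tags."""
    return EMPTY_SPAN.sub('', text)


print(EMPTY_SPAN.search(SAMPLE))      # None: nothing to delete in the heading
print(strip_empty_spans(WITH_EMPTY))  # <p>Some text</p>
```

In Sigil itself the equivalent would presumably be the pattern in Find and an empty Replace field.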
Old Today, 11:34 AM   #3
KarlG
Quote:
Originally Posted by Doitsu View Post
The .* in your regular expression also matches other tags. The following expression should work:

Code:
<span[^>]*></span>
But surely when 'minimal match' is selected, the search should also stop with the first '>'?
Old Today, 01:02 PM   #4
DiapDealer
Grand Sorcerer
Nope. Minimal Match can't affect the inherent greediness of *.

At least I don't recall it doing so in the past.

Last edited by DiapDealer; Today at 01:07 PM.
Old Today, 01:12 PM   #5
KarlG
Quote:
Originally Posted by DiapDealer View Post
Nope. Minimal Match can't affect the inherent greediness of *.

At least I don't recall it doing so in the past.
Hmm, maybe confusion on my part here then.

So what IS 'minimal match' for then, and what's the option (if there is one) for a non-greedy RegEx in Sigil?
Old Today, 01:29 PM   #6
DiapDealer
Doitsu's regex was an example of a non-greedy search. But let me do some investigation. I don't really use the Minimal Match option, so there could be a problem with it. I don't want to be too nasty in dismissing what might be a bug.
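To make the terms concrete, here is a rough Python comparison of the three variants discussed in this thread. Strictly speaking, `[^>]*` is still a greedy quantifier; it just cannot cross a `>`, so it has nowhere to overrun to. `TEXT` is a made-up string, `.*?` is used as a stand-in for what Minimal Match does (an assumption), and Python's `re` is not Sigil's engine, so treat this as a sketch only.

```python
import re

# Hypothetical line with two empty spans and something in between.
TEXT = '<span class="a"></span><i>x</i><span class="b"></span>'

GREEDY = r'<span .*></span>'     # the original pattern, Minimal Match off
LAZY = r'<span .*?></span>'      # lazy quantifier, stand-in for Minimal Match
NEGATED = r'<span[^>]*></span>'  # Doitsu's negated character class


def matches(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)


# Greedy: one match that swallows everything from the first span to the last.
print(matches(GREEDY, TEXT))
# Lazy and negated class: the two empty spans, matched separately.
print(matches(LAZY, TEXT))
print(matches(NEGATED, TEXT))
```

On this input the lazy and negated-class patterns agree. They differ only when the first opening tag is not followed directly by its closing tag, which is the case in the first post.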
Old Today, 01:38 PM   #7
KevinH
Sigil Developer
And no, the search will only stop at the first ">" if a match is found there; without a match, it will keep growing the search area until it finds the first match or none at all. That is what the "minimal match" flag means: it finds the minimal-length match if one exists.
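The behaviour described above can be reproduced with a short Python sketch, assuming the lazy quantifier `.*?` in Python's `re` behaves like Minimal Match for this pattern (an assumption, not something checked against Sigil's source). `first_lazy_match` is a made-up helper name.

```python
import re

# The heading from the first post.
SAMPLE = (
    '<hgroup><h2 class="CHAPTER" id="ch1"><span class="CN">'
    '<samp class="SANS_Futura_Std_Bold_Condensed_B_11">1</samp></span> '
    '<span class="CT"><samp class="SANS_Dogma_OT_Bold_B_11">'
    'WINDOWS FOUNDATIONAL CONCEPTS</samp></span></h2></hgroup>'
)

# ".*?" tries the shortest expansion first, then grows one character at a
# time until the rest of the pattern ("></span>") can match.
LAZY = re.compile(r'<span .*?></span>')


def first_lazy_match(text):
    """Return the first lazy match in text, or None if there is none."""
    m = LAZY.search(text)
    return m.group(0) if m else None


# The first ">" is followed by "<samp", not "</span>", so the engine keeps
# growing until it reaches the ">" of "</samp>", which "</span>" follows.
print(first_lazy_match(SAMPLE))
# <span class="CN"><samp class="SANS_Futura_Std_Bold_Condensed_B_11">1</samp></span>
```

That is the same text reported as matched in the first post, so the result is consistent with the explanation rather than with a bug.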

Last edited by KevinH; Today at 01:52 PM.
Old Today, 02:40 PM   #8
DNSB
Bibliophagist
Personally, I find it easier to use @DiapDealer's TagMechanic plugin for this type of task. Saves me from the issues when my fat fingers cause a typo.