09-21-2010, 05:02 AM | #16 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Oh, that works? I feel stupid now; somehow, that never occurred to me.
That's stretching my knowledge of theoretical computer science... I've heard lectures on program verification, algorithms and data structures, but that's about it. I'll have to think about what you wrote there. |
09-21-2010, 06:35 AM | #17 | |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Quote:
That, and all the flags are specific to python, and for the most part your tutorial has been about generic regex behavior. Probably best to stay generic, though that may be debatable... |
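To make the point about flags concrete, here is a minimal sketch (the sample text and variable names are made up for illustration). It shows the same case-insensitive match written once with Python's `re.IGNORECASE` flag and once with an inline `(?i)` modifier, which many, though not all, other regex engines also accept:

```python
import re

text = "Calibre and CALIBRE and calibre"

# Python-specific: the flag is passed as an extra argument to the re function.
with_flag = re.findall(r"calibre", text, re.IGNORECASE)

# Inline form: the modifier is part of the pattern itself and must
# appear at the very start of the expression.
inline = re.findall(r"(?i)calibre", text)

print(with_flag)
print(inline)
```

Both calls return the same three matches; only the way the option is spelled differs.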
|
09-21-2010, 10:50 PM | #18 |
Groupie
Posts: 155
Karma: 112134
Join Date: May 2009
Location: Kuala Lumpur
Device: iPad, K3, K4, T1
|
This thread has been very enlightening. My thanks to OP and everybody who contributed further; I've learned a lot.
|
09-21-2010, 10:57 PM | #19 |
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
|
Regular expressions have always been a little complex for me to understand. Great tutorial, by the way. One thing I have found that works really well, and that I use for a lot of the recipes I write for calibre, is this site:
http://www.txt2re.com/index.php3 You can type in the string that you want to search for, then pick the language (in our case, Python), and it spits out code with a regex that finds it. |
09-22-2010, 04:08 AM | #20 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Thanks to both of you.
As for that regexp generator, as far as I can see, you'd have to manually paste the expression together from the single elements it generates in code, yes? But still, that might be a handy tool. |
09-23-2010, 08:31 AM | #21 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Edited again: for style, I tried to clarify the distinction between the use of parentheses and square brackets (groups vs. sets), added notes on strings in general, added some examples, and included some flags.
Still not sure what to do about that part about special characters and escape sequences. Chaley, I generally agree that it's out of place where it is now, but I couldn't figure out where else to put it. I think I'm gonna go ahead and rewrite it into the respective portions, but that's for another pending edit. Also, still to come: even more examples. (Things I learned while writing this edit: Calibre likes file extensions in the test field for importing file expressions, and Notepad++ has an entirely different understanding of what \s means than Python does. Next time, I'm going to use Python to test examples right away.) |
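For anyone who wants to check what `\s` means in Python before trusting an editor's search box, a quick test along these lines may help. This is only a sketch; the sample string is made up:

```python
import re

sample = "one two\tthree\nfour"

# In Python, \s matches any whitespace character,
# including tabs and newlines, not just spaces.
whitespace = re.findall(r"\s", sample)
print(whitespace)

# Splitting on runs of whitespace therefore also splits across line breaks.
words = re.split(r"\s+", sample)
print(words)
```

A tool that matches line by line would never see the newline as part of the text it is searching, which is one way results can differ between programs.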
09-23-2010, 08:44 AM | #22 |
US Navy, Retired
Posts: 9,865
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
|
09-23-2010, 09:13 AM | #23 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
I'm talking about using the search feature with regular expressions, which, according to the program's help, uses the Scintilla regexp engine and is limited to per-line matching. Does changing the language change this behaviour?
Short answer: No, I didn't. |
09-23-2010, 09:24 AM | #24 | |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
(Sorry, I lapsed into some LaTeX there, but, seeing your background, I assume you understand.) |
|
09-23-2010, 10:42 AM | #25 | |
Enthusiast
Posts: 31
Karma: 12
Join Date: Mar 2010
Device: Kindle 2, Kindle 3
|
Quote:
http://www.pythonregex.com/ |
|
09-23-2010, 11:10 AM | #26 | ||||
Grand Sorcerer
Posts: 11,833
Karma: 7030035
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
First, some words on finite state machines (FSMs). An FSM consists of a set of states, connected by arcs that stand for events. Let's say that these events 'label' the arc. In the kind of machine we are talking about (deterministic), an event can appear on at most one arc out of a given state. Traversing an arc consumes the event. Events that are not labels for an arc leaving a particular state cannot be accepted by that state, and their occurrence is an error. The machine is constructed in advance from the pattern, and cannot change depending upon input. Thus the regular expression 'abc' would look something like Code:
start __a__ state1 __b__ state2 __c__ accept
The regular expression for 'ab*c' would look something like Code:
start __a__ state1 __c__ accept
            |_b_|
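The two machines described above can be sketched in Python as transition tables. This is only an illustration, not how any real regex engine is implemented; the state names and the `accepts` helper are made up for the example. For 'ab*c' the `b` arc loops back to state1, so that zero or more b's are accepted before the `c`:

```python
# Transition tables: state -> {event: next_state}.
# State names are illustrative only.
ABC = {
    "start": {"a": "state1"},
    "state1": {"b": "state2"},
    "state2": {"c": "accept"},
}

# 'ab*c': 'b' loops on state1, 'c' moves on to accept.
AB_STAR_C = {
    "start": {"a": "state1"},
    "state1": {"b": "state1", "c": "accept"},
}


def accepts(machine, text):
    """Run a deterministic machine over text, one character per event."""
    state = "start"
    for event in text:
        arcs = machine.get(state, {})
        if event not in arcs:
            # No arc leaving this state carries the event: an error.
            return False
        state = arcs[event]  # traversing the arc consumes the event
    return state == "accept"


print(accepts(ABC, "abc"))
print(accepts(AB_STAR_C, "ac"))
print(accepts(AB_STAR_C, "abbbc"))
print(accepts(AB_STAR_C, "abcb"))
```

Note that each table is fixed before any input is read, which is the "constructed in advance" property mentioned above.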
Quote:
Quote:
Quote:
Of course, you can write a regexp if you know *all* the utterances that you must match. For example, the regexp 'abba|abcba|abbbba' works fine. You might be interested in the paper at this link. It gets mathematical at times (lots of times), but it discusses how one translates REs used in searches to FSMs to really do the work. Last edited by chaley; 09-23-2010 at 11:29 AM. |
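As a small illustration of the "list every utterance" approach, the alternation from the post can be tried in Python like this. The extra test string `abcdcba` is made up for the example, to show that anything outside the explicit list is not matched:

```python
import re

# The pattern only knows the three strings that were listed explicitly.
pattern = re.compile(r"abba|abcba|abbbba")

candidates = ["abba", "abcba", "abbbba", "abcdcba"]

# fullmatch requires the whole string to match one of the alternatives.
matched = [c for c in candidates if pattern.fullmatch(c)]
print(matched)
```

The pattern works only because the set of strings is finite and known in advance; every new string needs another alternative.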
||||
09-23-2010, 11:12 AM | #27 | |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
|
|
09-23-2010, 11:27 AM | #28 | ||
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
After reading your post, I believe my difficulties lie in understanding FSMs. Reading the introduction to the relevant Wikipedia entry obviously doesn't give an understanding of the subject... I'm going to read up on this, as this is relevant to my (private) interests, but it's going to have to be on the back burner for a while due to some Real Life interference. Quote:
|
||
09-23-2010, 11:54 AM | #29 | ||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
As to the problem, I'm always up for a challenge, particularly one that tells me I "will fail." I'll quote the problem statement from the professor: Quote:
Code:
.* |
||
09-23-2010, 11:55 AM | #30 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Yes, you're right. Hey, that means that all those theories chaley posted about are wrong! Quick, let's publish a paper!
|