07-02-2020, 10:10 PM | #31 |
creator of calibre
Posts: 44,483
Karma: 24495778
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That's because built-in functions are not simple standalone functions you can learn from; they use other calibre code. However, if you want to see them, look in function_replace.py in the calibre source code.
|
12-01-2020, 07:10 PM | #32 |
Enthusiast
Posts: 26
Karma: 38
Join Date: Nov 2019
Location: Paris, France
Device: none
|
Automatically fill in the <title> tag of text pages
In the <head> section, a missing <title> tag causes an epubcheck error. You may also come across something like this (depending on the language):
<title>Unknown</title> or <title></title>
In these cases, the regex-function looks up the title in the metadata and uses it to fill in the <title> tag in the <head> section of the XHTML pages. If the title in the metadata is empty or still has the default value "Unknown", the function leaves the page as it is. You can then fill in the <dc:title> tag in the OPF, save the epub, reopen it in the editor and run the regex-function again. The function is commented. If the epub's language is neither English nor French, you must adapt the regex and the function by adding that language's equivalent of "Unknown".
The regex:
Code:
<title>(?:[Ii]nconnu\(e\)|[Uu]nknown)?</title>|<head>(?:(?!<title).)+\K(</head>)
The function:
Code:
# execute the function with this regex :
# <title>(?:[Ii]nconnu\(e\)|[Uu]nknown)?</title>|<head>(?:(?!<title).)+\K(</head>)
# Dot matches all

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Regex-function to fill in (with <dc:title> of the opf) a <title> tag in xml files
    # if there is no such tag or if it's 'Unknown', or localized equivalent

    # This tuple and the regex should be adapted to the language of the epub
    # Add in this tuple the string to target, in your language
    # +++ Must be in lower case +++, since the string in the test is lowered
    no_title = ('unknown', 'inconnu(e)')

    # 'is_dc_title' is true if metadata.title is defined
    # Warning : if no <dc:title> in the opf, metadata.title will take
    # the value 'Unknown' or its localized value (ex : Inconnu(e) for french)
    is_dc_title = (
        metadata.title is not None
        and len(metadata.title) > 0
        and metadata.title.lower() not in no_title
    )

    # no capturing group : <title> is empty or 'Unknown'
    # (we capture a group only if we reach </head> without finding <title>)
    if not match.group(1):
        if is_dc_title:
            title = " <title>" + metadata.title + "</title>"
        else:
            title = match.group()
    # found (</head>), thus <title> tag is missing
    else:
        if is_dc_title:
            title = " <title>" + metadata.title + "</title>" + '\n' + match.group(1)
        else:
            title = match.group(1)
            ######## Shall we fill in a tag if none ? ###########
            # comment/uncomment this line below if you want to write <title></title>
            # in case tag <title> is missing and <dc:title> is not defined
            # if commented, tag will be still missing
            # title = " <title></title>\n" + title

    return title
Last edited by EbookMakers; 12-08-2020 at 04:31 AM. |
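The check on metadata.title in the function above can be tried outside calibre. This is only a minimal sketch: `has_real_title` is a hypothetical helper name, and `SimpleNamespace` is a stand-in for the metadata object that calibre passes to a regex-function, not calibre's own class.

```python
from types import SimpleNamespace

# Lower-case placeholder titles treated as "no title" (same tuple as in the post).
NO_TITLE = ('unknown', 'inconnu(e)')

def has_real_title(metadata, no_title=NO_TITLE):
    """Return True when metadata.title is set and is not a placeholder value."""
    title = getattr(metadata, 'title', None)
    # bool(title) covers both None and the empty string.
    return bool(title) and title.lower() not in no_title

# SimpleNamespace only imitates an object with a .title attribute (assumption).
samples = {
    'real': SimpleNamespace(title='Les Misérables'),
    'default_en': SimpleNamespace(title='Unknown'),
    'default_fr': SimpleNamespace(title='Inconnu(e)'),
    'empty': SimpleNamespace(title=''),
    'missing': SimpleNamespace(title=None),
}
results = {name: has_real_title(md) for name, md in samples.items()}
```

Only the 'real' sample passes the check, so in every other case the regex-function would leave the page untouched. To support another language, add its lower-case placeholder to the tuple and to the regex.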
09-22-2021, 06:11 PM | #33 |
Nameless Being
|
|
10-17-2021, 05:19 PM | #34 |
Junior Member
Posts: 9
Karma: 10
Join Date: Oct 2021
Device: Kindle Voyager
|
Hi there!
I have code like this:
Code:
<p class="text">some text</p>
<blockquote class="email">
<p class="text">some <i>text</i></p>
<p class="text"><b>some</b> text</p>
<p class="text">some text</p>
</blockquote>
<p class="text">some text</p>
A regex search string like "<blockquote class="email">(.*?)<p class="text">(.*?)</blockquote>" works only if the blockquote contains a single p tag. I don't understand how to build the correct search string, or even how to search Google for it. I hope you can help. Thanks. |
10-17-2021, 06:09 PM | #35 | |
Not Quite Dead
Posts: 195
Karma: 654170
Join Date: Jul 2015
Device: Paperwhite 4; Galaxy Tab
|
@firsikov:
Using regex can be difficult and tricky at times. Consider another way to accomplish the same thing, without changing the HTML code. What you seem to want is to control the look of text within a particular type of blockquote. Consider using CSS contextual styles: Quote:
|
|
10-18-2021, 05:21 AM | #36 | |
Junior Member
Posts: 9
Karma: 10
Join Date: Oct 2021
Device: Kindle Voyager
|
Quote:
|
|
10-18-2021, 06:02 AM | #37 | |
Grand Sorcerer
Posts: 24,905
Karma: 47303822
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
Code:
<blockquote class="email">\1<p class="newclassname">\2</blockquote>
Code:
(<blockquote class="email">.*?<p class=")text(">.*?</blockquote>)
Code:
\1newclassname\2
For both, you have to run them multiple times. If you do "Replace all", it only makes one change in each blockquote. |
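The need for several passes can be reproduced outside the editor. The sketch below is an illustration under assumptions, not the editor's own code: it uses Python's `re` module with `re.DOTALL` standing in for the dot-matches-all option, and `replace_until_stable` is a hypothetical helper that repeats "Replace all" until a pass changes nothing.

```python
import re

# Second search/replace pair from the post above.
PATTERN = re.compile(
    r'(<blockquote class="email">.*?<p class=")text(">.*?</blockquote>)',
    re.DOTALL,  # lets .*? cross line breaks
)
REPLACEMENT = r'\1newclassname\2'

# Sample from the original question: one blockquote holding three paragraphs.
SAMPLE = '''<p class="text">some text</p>
<blockquote class="email">
<p class="text">some <i>text</i></p>
<p class="text"><b>some</b> text</p>
<p class="text">some text</p>
</blockquote>
<p class="text">some text</p>'''

def replace_until_stable(text):
    """Repeat the replacement until a pass makes no change.

    Returns the final text and the number of passes that changed something.
    """
    passes = 0
    while True:
        text, changed = PATTERN.subn(REPLACEMENT, text)
        if changed == 0:
            return text, passes
        passes += 1

result, passes = replace_until_stable(SAMPLE)
```

Each pass rewrites only the first remaining `<p class="text">` in the blockquote, because one match consumes the whole blockquote. Note that this loop is only safe with a single blockquote: when a file holds several, the lazy `.*?` can run past a `</blockquote>` on later passes and change paragraphs lying between two blockquotes.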
|
10-18-2021, 06:44 AM | #38 | |
Zealot
Posts: 145
Karma: 1451628
Join Date: Jul 2021
Device: N/A
|
Quote:
Select the "Regex-function" mode. Your "Find" field is:
(<blockquote class="email">)(.*?)(</blockquote>)
Create the regex-function with this code:
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    inside_block = match.group(2).replace('<p class="text"', '<p class="newtext"')
    return match.group(1) + inside_block + match.group(3)
Be careful: there is a single quote after the double quote in "text"' and "newtext"'; it must be kept. Then you can click "Replace all". |
|
10-18-2021, 08:30 AM | #39 | |
Junior Member
Posts: 9
Karma: 10
Join Date: Oct 2021
Device: Kindle Voyager
|
Quote:
Code:
<p class="text">some text</p>
<blockquote class="email">
<p class="text">some <i>text</i></p>
<p class="text"><b>some</b> text</p>
<p class="text">some text</p>
</blockquote>
<p class="text">some text</p>
<p class="text">some text</p>
<blockquote class="email">
<p class="text">some <i>text</i></p>
<p class="text"><b>some</b> text</p>
<p class="text">some text</p>
</blockquote>
<p class="text">some text</p>
Code:
<p class="text">some text</p>
<blockquote class="email">
<p class="newclass">some <i>text</i></p>
<p class="newclass"><b>some</b> text</p>
<p class="newclass">some text</p>
</blockquote>
<p class="newclass">some text</p>
<p class="newclass">some text</p>
<blockquote class="email">
<p class="newclass">some <i>text</i></p>
<p class="newclass"><b>some</b> text</p>
<p class="newclass">some text</p>
</blockquote>
<p class="text">some text</p> |
|
10-18-2021, 09:32 AM | #40 | |
Grand Sorcerer
Posts: 24,905
Karma: 47303822
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
|
|
10-18-2021, 11:58 AM | #41 |
Zealot
Posts: 145
Karma: 1451628
Join Date: Jul 2021
Device: N/A
|
Just a question: why don't you want to use my solution? It works like a charm on the example you gave, in one shot on the whole book (or on the page, if you choose "current file").
Screenshot of the diff-screen: https://i.imgur.com/v5oxiFy.jpeg |
10-18-2021, 12:24 PM | #42 | |
Junior Member
Posts: 9
Karma: 10
Join Date: Oct 2021
Device: Kindle Voyager
|
Quote:
But now, when I wanted to take a screenshot to show you, everything worked as I needed! Anyway, thanks for the advice. |
|
10-18-2021, 01:02 PM | #43 | |
Zealot
Posts: 145
Karma: 1451628
Join Date: Jul 2021
Device: N/A
|
Quote:
My mistake if that's the case: I should have told you to check it, but since Davidfor had already said so, I thought it was unnecessary to repeat it. Anyway, it's fine if your problem is solved :-). |
|
10-19-2021, 12:00 PM | #44 | |
Junior Member
Posts: 9
Karma: 10
Join Date: Oct 2021
Device: Kindle Voyager
|
Quote:
|
|
10-19-2021, 03:48 PM | #45 | |
Grand Sorcerer
Posts: 5,640
Karma: 23191067
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
|
|