09-29-2024, 04:26 PM | #1 |
Evangelist
Posts: 431
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
Regex function: title case
I want to change some all-caps text to title case. I tried the example from the Calibre manual, but it doesn't do anything: https://manual.calibre-ebook.com/function_mode.html
Text: <h2>Part Seven: ARTHUR OF BRITAIN</h2>
Find: <([Hh][1-6])[^>]*>.+?</\1>
Mode: Regex-function
Function: Title-case text (ignore tags)
Then I hit Find and it finds my text; I hit Replace, it flashes, but nothing changes. I expected the text to become <h2>Part Seven: Arthur Of Britain</h2>. Last edited by foosion; 09-29-2024 at 04:30 PM. |
09-29-2024, 04:32 PM | #2 |
Wizard
Posts: 1,351
Karma: 6794938
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
I think there is a bug in Calibre when using these change-case functions. The entire text block you want to change needs to be in a uniform case; mixed case, as in your example, is left unchanged.
In your case, you will need to first change to Capitalise, then run Title-case. |
|
09-29-2024, 04:39 PM | #3 | |
Evangelist
Posts: 431
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
Quote:
I would have expected Calibre to just apply Python's title() function, but apparently not. EDIT: Maybe the bug is in the manual. Last edited by foosion; 09-29-2024 at 05:23 PM. |
|
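For comparison, here is a minimal sketch of the behaviour the poster expected, written in plain Python outside calibre. It applies `str.title()` to the text between tags and leaves the tags alone. The helper name `title_ignoring_tags` and the `TAG` pattern are hypothetical, chosen only for illustration; this is not how calibre's built-in function is implemented.

```python
import re

# The Find expression from the first post.
HEADING = re.compile(r"<([Hh][1-6])[^>]*>.+?</\1>")
# Hypothetical helper pattern: splits a match into tags and text runs.
TAG = re.compile(r"(<[^>]+>)")


def title_ignoring_tags(match):
    """Apply str.title() to text runs, leaving tags untouched."""
    parts = TAG.split(match.group(0))
    return "".join(p if TAG.fullmatch(p) else p.title() for p in parts)


html = "<h2>Part Seven: ARTHUR OF BRITAIN</h2>"
fixed = HEADING.sub(title_ignoring_tags, html)
print(fixed)  # <h2>Part Seven: Arthur Of Britain</h2>
```

Note that `str.title()` capitalises every word, including small ones like "of", and treats an apostrophe as a word boundary (`"don't".title()` gives `"Don'T"`), which is presumably part of why a dedicated title-case routine exists.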
09-29-2024, 05:26 PM | #4 | |
Grand Sorcerer
Posts: 12,017
Karma: 7257323
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
Here is the source for calibre's titlecase() function. Code:
def titlecase(text):
    """
    Titlecases input text

    This filter changes all words to Title Caps, and attempts to be clever
    about *un*capitalizing SMALL words like a/an/the in the input.

    The list of "SMALL words" which are not capped comes from
    the New York Times Manual of Style, plus 'vs' and 'v'.
    """
    all_caps = icu_upper(text) == text

    pat = re.compile(r'(\s+)')
    line = []
    for word in pat.split(text):
        if not word:
            continue
        if pat.match(word) is not None:
            line.append(word)
            continue
        if all_caps:
            if UC_INITIALS.match(word):
                line.append(word)
                continue
            else:
                word = icu_lower(word)

        if APOS_SECOND.match(word):
            word = word.replace(word[0], icu_upper(word[0]), 1)
            word = word[:2] + icu_upper(word[2]) + word[3:]
            line.append(word)
            continue
        if INLINE_PERIOD.search(word) or UC_ELSEWHERE.match(word):
            line.append(word)
            continue
        if SMALL_WORDS.match(word):
            line.append(icu_lower(word))
            continue

        hyphenated = []
        for item in word.split('-'):
            hyphenated.append(CAPFIRST.sub(lambda m: icu_upper(m.group(0)), item))
        line.append("-".join(hyphenated))

    result = "".join(line)

    result = SMALL_FIRST.sub(lambda m: '{}{}'.format(
        m.group(1),
        capitalize(m.group(2))
    ), result)

    result = SMALL_AFTER_NUM.sub(lambda m: '{}{}'.format(
        m.group(1),
        capitalize(m.group(2))
    ), result)

    result = SMALL_LAST.sub(lambda m: capitalize(m.group(0)), result)

    result = SUBPHRASE.sub(lambda m: '{}{}'.format(
        m.group(1),
        capitalize(m.group(2))
    ), result)

    return result
|
|
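Reading the source above, the mixed-case behaviour seems to follow from two checks: words are only lower-cased when the whole input is upper case (`all_caps`), and any word matching `UC_ELSEWHERE` is passed through unchanged. The sketch below is a simplified stand-in to illustrate that interaction, not calibre's code: `simple_titlecase` is a hypothetical name, the `UC_ELSEWHERE` pattern here is an assumed approximation (a capital letter after the first letter of a word), `str.upper`/`str.lower` stand in for `icu_upper`/`icu_lower`, and small-word handling is omitted.

```python
import re

# Assumed approximation of calibre's UC_ELSEWHERE: optional leading
# punctuation, then letters, then a capital that is not the first letter.
UC_ELSEWHERE = re.compile(r"[^A-Za-z]*[A-Za-z]+[A-Z]")


def simple_titlecase(text):
    """Simplified sketch of the all_caps / UC_ELSEWHERE interaction."""
    all_caps = text.upper() == text
    out = []
    for word in re.split(r"(\s+)", text):
        if not word or word.isspace():
            out.append(word)  # keep whitespace runs as they are
            continue
        if all_caps:
            word = word.lower()  # only happens for fully upper-case input
        if UC_ELSEWHERE.match(word):
            out.append(word)  # capitals inside the word: left untouched
            continue
        out.append(word[:1].upper() + word[1:])
    return "".join(out)


mixed = simple_titlecase("Part Seven: ARTHUR OF BRITAIN")
upper = simple_titlecase("ARTHUR OF BRITAIN")
lower = simple_titlecase("part seven: arthur of britain")
print(mixed)  # Part Seven: ARTHUR OF BRITAIN  (unchanged)
print(upper)  # Arthur Of Britain
print(lower)  # Part Seven: Arthur Of Britain
```

Under these assumptions the mixed-case heading comes back unchanged, while input in a single uniform case is title-cased, which matches the workaround of normalising the case first.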
09-29-2024, 05:30 PM | #5 | |
Wizard
Posts: 1,351
Karma: 6794938
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
It is a simple workaround anyway, so I never bothered to query it. Being aware of it is enough. |
|
09-29-2024, 05:40 PM | #6 |
Evangelist
Posts: 431
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
In that case, consider changing the manual, since the example it gives is "<h1>some TITLE</h1> to <h1>Some Title</h1>".
|
|