11-23-2015, 09:57 PM | #1 |
Fanatic
Posts: 526
Karma: 32158
Join Date: Feb 2012
Device: Onyx Boox Leaf
|
Splitting multiple html files?
Hi you guys,
I know that the Editor can split a single HTML file using XPath, which is great. But I wonder if there is a way to split all the HTML files at the same time (something like "split mark" in Sigil). Previously I kept all the footnotes at the end of their respective HTML files; now I want to merge them into a single endnote file. As it stands, I have to go to every HTML file, split it, and merge... Also, I used file_name in a Regex Function and it returns the whole HTML path (I can use regex to strip off the unwanted part), but is there a way to get only the name, without the extension? (I use it for note IDs) |
11-24-2015, 03:19 AM | #2 |
Interested in the matter
Posts: 421
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook InkPad 3
|
I also use file_name (the full path) for note IDs. But why do you need to remove the extension?
To extract the notes from all files and dump them into a specific file (notas.xhtml), since I don't know enough Python, I do the following:
1- I create notas.xhtml
2- I use this regex-function
Code:
# Search expression: (<p class="nota".+?>.+?</p>)
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Append each matched note to an external text file
    with open('e:/Libros/Taller/En curso/notas.txt', 'a') as notas:
        notas.write(match.group() + '\n')
    # Returning an empty string removes the note from the original file
    return ''

# Process the files in spine (reading) order
replace.file_order = 'spine'
And sorry for my English. |
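A variation on the function above, as a rough sketch only: instead of reopening the text file on every match, the notes are collected in the `data` dictionary and written once at the end. It assumes that the Editor passes the same `data` dictionary to every call during a "Replace all", and that setting `replace.call_after_last_match = True` triggers one extra call with `match` set to None, as described in the calibre manual for function mode. `NOTES_PATH` is a hypothetical output location, and the last few lines only simulate the Editor with Python's `re` module so the function can be tried outside calibre.

```python
import os
import re
import tempfile

# Hypothetical output path; change it to wherever the notes should go.
NOTES_PATH = os.path.join(tempfile.gettempdir(), 'notas.txt')

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    if match is None:
        # Extra call after the last match: write all collected notes at once.
        with open(NOTES_PATH, 'w', encoding='utf-8') as notas:
            notas.write('\n'.join(data.get('notes', [])) + '\n')
        return ''
    # Remember the note and remove it from the original file.
    data.setdefault('notes', []).append(match.group())
    return ''

replace.file_order = 'spine'
replace.call_after_last_match = True

# Simulation outside calibre (the Editor normally makes these calls itself).
pattern = re.compile(r'(<p class="nota".+?>.+?</p>)')
data = {}
html = ('<p>Text</p>'
        '<p class="nota" id="n1">Note one</p>'
        '<p class="nota" id="n2">Note two</p>')
cleaned = pattern.sub(
    lambda m: replace(m, 0, 'OEBPS/1.html', None, None, data, None), html)
replace(None, 0, 'OEBPS/1.html', None, None, data, None)
```

The collected notes still have to be pasted into notas.xhtml by hand, as in the original procedure.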
11-24-2015, 03:23 AM | #3 |
Fanatic
Posts: 526
Karma: 32158
Join Date: Feb 2012
Device: Onyx Boox Leaf
|
Thank you. I will try to play with that. I'm no programmer, though.
What I did was search for "#n(\d+)" (as in #n1, for example)
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    text = '#'
    text2 = '_'
    return text + file_name + text2 + match.group(1)  # gives: #OEBPS/1.html_1
I would only want: #1_1 (of course, I could use regex to clean up the unwanted portion afterward, but it would be nicer to have it done in one regex function, and I could learn something as well)
Last edited by nqk; 11-24-2015 at 03:32 AM. |
11-25-2015, 03:19 AM | #4 |
Interested in the matter
Posts: 421
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook InkPad 3
|
You can use:
file_name = file_name[6:len(file_name)-5]
This drops the first 6 characters ("OEBPS/") and the last 5 (".html"). |
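The slice above relies on the folder being exactly "OEBPS/" and the extension being exactly ".html". A slightly more general sketch, assuming the Editor passes book-relative names with forward slashes (as in OEBPS/1.html), is to let Python's standard `posixpath` module strip the folder and the extension. `short_name` is a helper name made up for this example, and the last two lines only simulate a single match outside calibre.

```python
import posixpath
import re

def short_name(file_name):
    """Return the file name without folder or extension: 'OEBPS/1.html' -> '1'."""
    base = posixpath.basename(file_name)       # '1.html'
    stem, _ext = posixpath.splitext(base)      # ('1', '.html')
    return stem

# Search expression assumed: #n(\d+)
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    return '#' + short_name(file_name) + '_' + match.group(1)

# Simulation of one match outside calibre.
m = re.search(r'#n(\d+)', '<a href="#n1">1</a>')
result = replace(m, 1, 'OEBPS/1.html', None, None, {}, None)
```

This should also cope with names such as OEBPS/Text/chapter02.xhtml, where the fixed slice would cut in the wrong place.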
11-26-2015, 09:38 PM | #5 |
Fanatic
Posts: 526
Karma: 32158
Join Date: Feb 2012
Device: Onyx Boox Leaf
|
|
11-27-2015, 03:08 AM | #6 |
Interested in the matter
Posts: 421
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook InkPad 3
|
You are welcome.
|
|