01-18-2012, 12:36 PM   #1
hiperlink
Replacing item with while using auto_cleanup = True

Hi All,

I'm developing a new recipe for a subscription-only Hungarian website, and it is almost finished: generating the feed from the index and fetching the articles both work.

I'm using auto_cleanup = True to create readable articles, which works rather well, and I'm happy with the output.

My only remaining issue is that, although I have set up some regex-based removal like this:

Code:
import re

preprocess_regexps = [
    (re.compile(r'<!--.*?-->', re.DOTALL), lambda m: ''),  # HTML comments
    (re.compile(r'<p align="left"'), lambda m: '<p'),  # alignment attribute
    (re.compile(r'<a href="/"><img src="images/logo.jpg".*?/></a>'), lambda m: ''),  # site logo
    (re.compile(r'<a href="javascript:changeFontSize.*?/></a>', re.DOTALL), lambda m: ''),  # font-size links
    (re.compile(r'\| ÉLET ÉS IRODALOM</title>'), lambda m: '</title>'),  # site name in <title>
]


It looks like it does not replace anything (especially the last line), and I don't know why.

This matters because I noticed that the article's title comes from the page's <title> tag. For some reason, the original <title> tag on the article page contains that unnecessary uppercase text (with a | in front of it). Can someone give me a hint on how to remove it?
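
One way to narrow this down is to run the last pattern against a sample string outside calibre. The snippet below is only a sketch: the sample markup and the article title in it are made up for illustration, and the real page may differ in spacing or encoding.

```python
import re

# The last (pattern, replacement) pair from the list above.
pattern = re.compile(r'\| ÉLET ÉS IRODALOM</title>')
replacement = lambda m: '</title>'

# Hypothetical sample markup; not taken from the real site.
sample = '<html><head><title>Cikk címe | ÉLET ÉS IRODALOM</title></head></html>'

result = pattern.sub(replacement, sample)
print(result)
```

If the substitution works here, the pattern itself is probably not the problem, and the question becomes whether the regexps are applied to the page at all. Note that this pattern leaves the space before the | in place.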
01-18-2012, 12:45 PM   #2
kovidgoyal
creator of calibre
If you use auto_cleanup, preprocess_regexps are not used. Use postprocess_html instead.
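
A minimal sketch of that suggestion, under some assumptions: the recipe name ExampleRecipe and the helper strip_site_suffix are hypothetical, the suffix text is taken from the regex in the question, and the soup object is assumed to be a BeautifulSoup tree whose title text can be swapped with replaceWith. The import fallback exists only so the helper can be tried outside calibre.

```python
import re

try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Fallback so the helper below can be tried outside calibre.
    BasicNewsRecipe = object

# Matches the trailing " | ÉLET ÉS IRODALOM" suffix described in the question.
SITE_SUFFIX = re.compile(r'\s*\|\s*ÉLET ÉS IRODALOM\s*$')


def strip_site_suffix(title):
    """Return the title text without the trailing site-name suffix."""
    return SITE_SUFFIX.sub('', title)


class ExampleRecipe(BasicNewsRecipe):  # hypothetical recipe name
    title = 'Example'
    auto_cleanup = True

    def postprocess_html(self, soup, first_fetch):
        # soup is the parsed article; rewrite its <title> text if present.
        title_tag = soup.find('title')
        if title_tag is not None and title_tag.string:
            title_tag.string.replaceWith(strip_site_suffix(title_tag.string))
        return soup
```

For example, strip_site_suffix('Cikk címe | ÉLET ÉS IRODALOM') returns 'Cikk címe', and a title without the suffix is returned unchanged. Whether the cleaned <title> ends up as the displayed article title depends on how the recipe builds its feed, so this is worth checking against the generated output.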