05-01-2010, 11:51 AM | #1876 |
Junior Member
Posts: 8
Karma: 10
Join Date: Apr 2010
Device: Kindle 3
|
Okay, so I tried to make an AV Club website recipe by customising the BBC one. It seems to work fine, but I need some help removing all the extra stuff: headers, sidebar, images, etc. This is what the recipe looks like at the moment:
Code:
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal <kovid at kovidgoyal.net>'
'''
bbc.co.uk
'''
from calibre.web.feeds.news import BasicNewsRecipe

class BBC(BasicNewsRecipe):
    title = u'The Onion AV Club'
    __author__ = 'Kovid Goyal'
    description = 'Film, Television and Music Reviews'
    oldest_article = 2
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    encoding = 'utf-8'
    remove_javascript = True
    remove_tags = [dict(name='div', attrs={'class':'footer'})]
    extra_css = '.headline {font-size: x-large;} \n .fact { padding-top: 10pt }'

    feeds = [
        ('Interviews', 'http://www.avclub.com/feed/interview/'),
        ('Features', 'http://www.avclub.com/feed/features/'),
        ('Film', 'http://www.avclub.com/feed/film/'),
        ('Music', 'http://www.avclub.com/feed/music/'),
        ('DVD', 'http://www.avclub.com/feed/dvd/'),
        ('Books', 'http://www.avclub.com/feed/books/'),
        ('Games', 'http://www.avclub.com/feed/games/'),
        ('AV Club Daily', 'http://www.avclub.com/feed/daily/'),
    ]
|
05-01-2010, 04:16 PM | #1877 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Code:
keep_only_tags = [dict(name='div', attrs={'id':'content'})]

remove_tags = [
    dict(name='div', attrs={'class':['footer','tools_horizontal']}),
    dict(name='div', attrs={'id':['tool_holder','elsewhere_on_avclub']}),
]

As an aside, when someone has already tried to make the recipe and posts it with feeds, etc., it is much easier to help. I'm also more inclined to help if they've done as much as they can themselves. In this case, as in many cases, all that was needed was to open the article in Firefox, then use Firebug to identify the class, div, id, etc. of the elements that should be kept or removed.
|
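For anyone following along, here is a rough sketch of how those two settings might slot into the recipe from the first post. The class name `AVClub` and the try/except stand-in base class are my own additions (the stand-in only exists so the file can be imported outside calibre). The feed list is shortened, and the ids and classes are simply the ones from the code above, so they may no longer match the site.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Assumption: stand-in base class so the sketch can be imported outside calibre
    class BasicNewsRecipe(object):
        pass


class AVClub(BasicNewsRecipe):  # hypothetical class name
    title = u'The Onion AV Club'
    description = 'Film, Television and Music Reviews'
    oldest_article = 2
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    encoding = 'utf-8'
    remove_javascript = True

    # Keep only the main article container...
    keep_only_tags = [dict(name='div', attrs={'id': 'content'})]

    # ...then strip the leftovers inside it. This replaces the original
    # one-entry remove_tags list, which only removed the footer.
    remove_tags = [
        dict(name='div', attrs={'class': ['footer', 'tools_horizontal']}),
        dict(name='div', attrs={'id': ['tool_holder', 'elsewhere_on_avclub']}),
    ]

    feeds = [
        ('Interviews', 'http://www.avclub.com/feed/interview/'),
        ('Film', 'http://www.avclub.com/feed/film/'),
        # remaining feeds from the original recipe go here
    ]
```

Since `keep_only_tags` is applied first, `remove_tags` only has to deal with whatever is left inside the `content` div.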
|
05-01-2010, 04:29 PM | #1878 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
05-01-2010, 04:38 PM | #1879 |
Junior Member
Posts: 8
Karma: 10
Join Date: Apr 2010
Device: Kindle 3
|
Thanks a lot for the help - that worked perfectly. I'm a complete novice, but I took a look at Firebug and it seems easy enough for me to use in the future.
For others that want it, here's the Onion AV Club recipe: onionavclub.zip |
05-01-2010, 05:31 PM | #1880 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Just right click on an element you want to remove (while viewing it in Firefox at the site) and select "Inspect Element." You can then figure out how to remove it from the popup that appears.
|
05-02-2010, 03:31 AM | #1881 |
Connoisseur
Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
|
New recipe for Il Messaggero, an Italian daily newspaper |
05-02-2010, 04:46 AM | #1882 |
Connoisseur
Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
|
|
05-03-2010, 02:02 AM | #1883 |
Connoisseur
Posts: 53
Karma: 496648
Join Date: May 2010
Device: Sony PRS-600
|
Dear Forumites
Can anyone please help me create a custom recipe for an English-language Hong Kong newspaper? It does need a username and password. http://www.scmp.com/portal/site/SCMP...ervices&ss=RSS If you could please get me started with a template I would be absolutely indebted to you.... |
05-03-2010, 03:11 PM | #1884 |
Junior Member
Posts: 8
Karma: 10
Join Date: May 2010
Device: Bebook One (Hanlin v3)
|
Swedish news
EDIT: Double post mistake
Last edited by Tumaini; 05-04-2010 at 08:25 PM. |
05-03-2010, 03:21 PM | #1885 |
Junior Member
Posts: 8
Karma: 10
Join Date: May 2010
Device: Bebook One (Hanlin v3)
|
Here are recipes for two Swedish news networks:
Ekot (NOTE - Ekot changed their format so this script probably won't work): Code:
from calibre.web.feeds.news import BasicNewsRecipe

class Ekot_SE(BasicNewsRecipe):
    title = 'Ekot'
    __author__ = 'Joakim Lindskog'
    description = 'Nyheter från Ekot'  # "News from Ekot"
    publisher = 'Ekot'
    category = 'news, politics, Sweden'
    oldest_article = 7
    delay = 1
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    encoding = 'utf-8'
    language = 'sv'

    conversion_options = {
        'comment'   : description,
        'tags'      : category,
        'publisher' : publisher,
        'language'  : language,
    }

    # Keep only the headline, article header, intro and body text
    keep_only_tags = [
        dict(name='h1', attrs={'class':'newsH2'}),
        dict(name='div', attrs={'class':'articleTop'}),
        dict(name='div', attrs={'class':'newsIntro'}),
        dict(name='div', attrs={'class':'newsText'}),
    ]

    remove_tags = [
        dict(name=['object','link','base']),
        dict(name='span', attrs={'class':'relLink'}),
    ]

    feeds = [
        (u'Ekot', u'http://api.sr.se/api/rssfeed/rssfeed.aspx?rssfeed=83'),
        (u'Utrikes', u'http://api.sr.se/api/rssfeed/rssfeed.aspx?rssfeed=3304'),
        (u'Radiosporten', u'http://api.sr.se/api/rssfeed/rssfeed.aspx?rssfeed=179'),
    ]

    def print_version(self, url):
        # Swap the article URL for its printer-friendly equivalent
        return url.replace('http://sverigesradio.se/cgi-bin/ekot/artikel.asp', 'http://sverigesradio.se/cgi-bin/isidorpub/PrinterFriendlyArticle.asp') + '&ProgramID=83'

Fria Tidningen: Code:
from calibre.web.feeds.news import BasicNewsRecipe

class FriaTidningen_SE(BasicNewsRecipe):
    title = u'Fria Tidningen'
    __author__ = 'Joakim Lindskog'
    description = 'Nyheter från Fria Tidningen'  # "News from Fria Tidningen"
    publisher = 'Fria Tidningen'
    category = 'news, politics, Sweden'
    oldest_article = 7
    delay = 1
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    encoding = 'utf-8'
    language = 'sv'

    conversion_options = {
        'comment'   : description,
        'tags'      : category,
        'publisher' : publisher,
        'language'  : language,
    }

    keep_only_tags = [dict(name='div', attrs={'id':'content-area'})]
    remove_tags_before = dict(name='div', attrs={'id':'content-area'})
    remove_tags_after = dict(name='div', attrs={'id':'byline'})

    # Embedded objects, comments and sidebar blocks
    remove_tags = [
        dict(name=['object','link','base']),
        dict(name='div', attrs={'id':'comments'}),
        dict(name='div', attrs={'id':'block-block-21'}),
        dict(name='div', attrs={'id':'block-block-22'}),
        dict(name='div', attrs={'id':'block-block-23'}),
        dict(name='div', attrs={'id':'block-block-24'}),
        dict(name='div', attrs={'id':'block-block-25'}),
        dict(name='div', attrs={'id':'block-block-26'}),
        dict(name='div', attrs={'id':'block-block-27'}),
        dict(name='div', attrs={'id':'block-block-28'}),
        dict(name='div', attrs={'id':'block-block-29'}),
        dict(name='div', attrs={'id':'block-block-30'}),
        dict(name='div', attrs={'id':'block-block-40'}),
    ]

    feeds = [
        (u'Allt', u'http://www.fria.nu/feed'),
        (u'Nyheter', u'http://www.fria.nu/taxonomy/term/13/feed/feed'),
        (u'Inrikes', u'http://www.fria.nu/taxonomy/term/14/0/feed'),
        (u'Utrikes', u'http://www.fria.nu/taxonomy/term/15/0/feed'),
        (u'Ekonomi', u'http://www.fria.nu/taxonomy/term/27047/0/feed'),
        (u'Opinion', u'http://www.fria.nu/taxonomy/term/22/0/feed'),
        (u'Inledaren', u'http://www.fria.nu/taxonomy/term/24/0/feed'),
        (u'Argument', u'http://www.fria.nu/taxonomy/term/23/0/feed'),
        (u'Synpunkten', u'http://www.fria.nu/taxonomy/term/26/0/feed'),
        (u'Debatt', u'http://www.fria.nu/taxonomy/term/25/0/feed'),
        (u'Kultur', u'http://www.fria.nu/taxonomy/term/19/0/feed'),
        (u'Kulturnyheter', u'http://www.fria.nu/taxonomy/term/24534/0/feed'),
        (u'Recensioner', u'http://www.fria.nu/taxonomy/term/24535/0/feed'),
        (u'BAK', u'http://www.fria.nu/taxonomy/term/27/0/feed'),
        (u'Sport & Hälsa', u'http://www.fria.nu/taxonomy/term/27215/0/feed'),
        (u'Sport', u'http://www.fria.nu/taxonomy/term/20/0/feed'),
        (u'Hälsa', u'http://www.fria.nu/taxonomy/term/21/0/feed'),
        (u'Fördjupning', u'http://www.fria.nu/taxonomy/term/24994/0/feed'),
        (u'Fokus', u'http://www.fria.nu/taxonomy/term/24864/0/feed'),
        (u'Samtal', u'http://www.fria.nu/taxonomy/term/28/0/feed'),
        (u'Stockholm', u'http://www.fria.nu/taxonomy/term/122/0/feed'),
        (u'Göteborg', u'http://www.fria.nu/taxonomy/term/73/0/feed'),
        (u'Uppsala', u'http://www.fria.nu/taxonomy/term/27324/0/feed'),
        (u'Malmö', u'http://www.fria.nu/taxonomy/term/28031/0/feed'),
    ]

Last edited by Tumaini; 05-05-2010 at 09:25 AM. |
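The long run of `block-block-NN` entries in the Fria Tidningen `remove_tags` could also be generated rather than typed out. A minimal sketch in plain Python follows. `BLOCK_NUMBERS` and `block_tags` are names made up for this example, and the result is meant to be the same list of dicts as in the recipe, nothing more.

```python
# Block numbers taken from the Fria Tidningen recipe: 21 through 30, plus 40
BLOCK_NUMBERS = list(range(21, 31)) + [40]

# One dict per sidebar block, same shape as the hand-written entries
block_tags = [
    dict(name='div', attrs={'id': 'block-block-%d' % n})
    for n in BLOCK_NUMBERS
]

# The two non-block entries stay as they were, followed by the generated ones
remove_tags = [
    dict(name=['object', 'link', 'base']),
    dict(name='div', attrs={'id': 'comments'}),
] + block_tags
```

If the two helper names are defined at module level above the class, the recipe can assign `remove_tags` in the class body the same way. Whether this is nicer than the explicit list is a matter of taste; the explicit version is easier to edit one block at a time.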
05-04-2010, 01:07 PM | #1886 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
New recipe for The Indian Express in English:
|
05-04-2010, 09:09 PM | #1887 |
Connoisseur
Posts: 53
Karma: 496648
Join Date: May 2010
Device: Sony PRS-600
|
Hi Kiklop
I notice you have been really helpful to many fellow readers. Would you be so kind as to start me off with a recipe for the South China Morning Post? www.scmp.com http://www.scmp.com/portal/site/SCMP...ervices&ss=RSS It does require a password and username. Many thanks WL |
05-05-2010, 08:08 AM | #1889 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
It is very hard to write one without the password and username. You will probably have to give someone your password and username, or try writing it yourself. It's often not that hard, as you can follow another recipe that works, then tweak.
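As a starting point for a site that needs a login, the usual pattern is to set `needs_subscription = True` and override `get_browser` so it submits the login form. The sketch below is untested against scmp.com. The class name, the login URL, the form index, and the form field names are all placeholders that would have to be read off the real login page (Firebug helps here), and the feed list is left empty on purpose. The try/except stand-in base class is only there so the file can be imported outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Assumption: stand-in base class so the sketch can be imported outside calibre
    class BasicNewsRecipe(object):
        pass


class SCMPSketch(BasicNewsRecipe):  # hypothetical class name
    title = u'South China Morning Post'
    language = 'en'
    oldest_article = 2
    max_articles_per_feed = 100
    no_stylesheets = True

    # Tells calibre to ask for a username and password for this recipe
    needs_subscription = True

    # Placeholders: replace with the values found on the real login page
    LOGIN_URL = 'http://www.example.com/login'  # hypothetical URL
    USER_FIELD = 'username'                     # hypothetical form field name
    PASS_FIELD = 'password'                     # hypothetical form field name

    feeds = []  # fill in with (title, feed URL) pairs from the site's RSS page

    def get_browser(self):
        # Start from calibre's default browser, then log in with it
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open(self.LOGIN_URL)
            br.select_form(nr=0)  # assumes the login form is the first form on the page
            br[self.USER_FIELD] = self.username
            br[self.PASS_FIELD] = self.password
            br.submit()
        return br
```

Once the login works, the rest is the same as any other recipe: add the feeds, then use `keep_only_tags` and `remove_tags` to trim the article pages.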
|
05-05-2010, 08:12 AM | #1890 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
You're welcome. I've read all those pages multiple times. Each time I have a problem, I go back to them. Feel free to come back here for help. This thread is sort of a mixture of people who don't feel they can do it themselves, and those who want to get their hands dirty and tackle the recipe, but need some guidance. Kiklop is a true expert, but for simpler problems, there are others here who can help.
|
|