02-13-2011, 09:06 PM | #1 |
Junior Member
Posts: 4
Karma: 748
Join Date: Jan 2011
Device: Kindle 3
|
Having trouble getting complete article for Reading Eagle
I apologize in advance if this has been discussed -- I couldn't find it.
Here is the RSS feed: http://readingeagle.com/feeds/all/newsrss.xml
I only get the first few lines of each article. Here is my recipe:
Code:
class AdvancedUserRecipe1297542834(BasicNewsRecipe):
    title = u'Reading Eagle'
    use_embedded_content = True
    oldest_article = 7
    max_articles_per_feed = 100
    remove_javascript = True
    no_stylesheets = True
    remove_empty_feeds = True

    feeds = [
        (u'local news', u'http://readingeagle.com/feeds/all/newsrss.xml'),
    ]
11-25-2011, 09:36 AM | #2 | |
Member
Posts: 18
Karma: 10
Join Date: Aug 2011
Device: Nook
|
Re: Trouble returning whole article
Quote:
def print_version(self, url):
    return self.browser.open_novisit(url).geturl()
to no avail. How can I get my recipe to follow that Read More URL? Is there a built-in recipe for another site that has the same problem that I could crib from? Maddeningly, there is a print version which downloads fine, but its URL cannot be derived from the one for the non-print version because it uses a number unrelated to the original article title.
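Since the print URL cannot be computed from the article URL, one option is to read it out of the article page itself. The following is a minimal sketch of that idea in plain Python, outside calibre. It assumes the print link looks like /print/<number>, which is the pattern used in the working recipe later in this thread; the helper name find_print_url and the example.com URLs are made up for illustration.

```python
import re
from urllib.parse import urljoin

# Assumption: the print link has the form /print/<number>, as in the
# url_regex used by the Progressive recipe later in this thread.
PRINT_LINK = re.compile(r'href=["\']([^"\']*/print/[0-9]+)["\']')


def find_print_url(page_html, page_url):
    """Return the absolute URL of the first /print/<number> link, or None.

    Hypothetical helper: it only scans the HTML text it is given and does
    no downloading of its own.
    """
    match = PRINT_LINK.search(page_html)
    if match is None:
        return None
    # Resolve a relative href against the address of the page it came from
    return urljoin(page_url, match.group(1))


# Made-up teaser page containing a link to the print view
sample = '<p>Teaser text</p><a href="/print/12345">Print</a>'
print(find_print_url(sample, 'http://example.com/story/some-title'))
```

Inside a recipe the same lookup would run on the downloaded article page, so each article costs one extra request.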
11-25-2011, 09:53 AM | #3 |
creator of calibre
Posts: 44,391
Karma: 23798586
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You want
use_embedded_content = False, not True.
11-25-2011, 08:36 PM | #4 |
Member
Posts: 18
Karma: 10
Join Date: Aug 2011
Device: Nook
|
Thanks much for your reply. Unfortunately, that just makes the TOC disappear, so now all I get is 'Start' and no content. Here is my recipe:
Code:
class AdvancedUserRecipe1322154189(BasicNewsRecipe):
    title = u'the Progressive'
    masthead_url = 'http://progressive.org/sites/all/themes/progress/logo.png'
    oldest_article = 7

    feeds = [u'http://feeds.feedburner.com/progressivefeed']

    def get_cover_url(self):
        soup = self.index_to_soup('http://progressive.org')
        item = soup.find('div', attrs={'class': 'views-field-field-cover-fid'})
        if item:
            return item.img['src']
        return None

David
11-25-2011, 09:12 PM | #5 | |
Member
Posts: 18
Karma: 10
Join Date: Aug 2011
Device: Nook
|
Quote:
Code:
class AdvancedUserRecipe1297542834(BasicNewsRecipe):
    title = u'Reading Eagle'
    oldest_article = 7
    max_articles_per_feed = 100
    remove_empty_feeds = True
    auto_cleanup = True

    feeds = [
        (u'local news', u'http://readingeagle.com/feeds/all/newsrss.xml'),
    ]

    def print_version(self, url):
        return url + '#'
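The print_version hook in this recipe returns the article URL with an empty fragment appended. The thread does not say why that helps, so the following is only a small sketch, outside calibre, of what the transformation does to a URL: once the fragment is split off, the address is the same as before. The article URL is made up for illustration.

```python
from urllib.parse import urldefrag


def print_version(url):
    # Same transformation as the recipe's hook, written as a plain function
    return url + '#'


article_url = 'http://example.com/news/article.aspx?id=1'  # hypothetical URL
rewritten = print_version(article_url)

# urldefrag separates the document address from the fragment part
base, fragment = urldefrag(rewritten)
print(rewritten)
print(base == article_url, repr(fragment))
```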
11-25-2011, 10:14 PM | #6 |
Member
Posts: 18
Karma: 10
Join Date: Aug 2011
Device: Nook
|
Fixed my recipe for The Progressive using code from the recipe for Alternet! Here it is, for anyone else who wants it (it doesn't get you the whole magazine, just a few articles and some web-only content):
Code:
from calibre.ptempfile import PersistentTemporaryFile

class AdvancedUserRecipe1322154189(BasicNewsRecipe):
    title = u'the Progressive'
    masthead_url = 'http://progressive.org/sites/all/themes/progress/logo.png'
    oldest_article = 7
    articles_are_obfuscated = True
    use_embedded_content = False
    auto_cleanup = True
    temp_files = []

    feeds = [u'http://feeds.feedburner.com/progressivefeed']

    def get_article_url(self, article):
        return article.get('link', None)

    def get_obfuscated_article(self, url):
        # Open the article page, then follow its first /print/<number> link
        br = self.get_browser()
        br.open(url)
        response = br.follow_link(url_regex=r'/print/[0-9]+', nr=0)
        html = response.read()
        # Save the print page to a temp file and return the file's path
        self.temp_files.append(PersistentTemporaryFile('_fa.html'))
        self.temp_files[-1].write(html)
        self.temp_files[-1].close()
        return self.temp_files[-1].name

    def get_cover_url(self):
        soup = self.index_to_soup('http://progressive.org')
        item = soup.find('div', attrs={'class': 'views-field-field-cover-fid'})
        if item:
            return item.img['src']
        return None
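The last step of get_obfuscated_article, saving the fetched HTML to a file that outlives the function call and returning its path, can be tried on its own without calibre. This is a rough sketch using the standard library's tempfile module as a stand-in for calibre's PersistentTemporaryFile; the function name save_html_to_temp_file is made up, and unlike the recipe the caller here has to delete the file itself.

```python
import os
import tempfile


def save_html_to_temp_file(html_bytes, suffix='_fa.html'):
    """Write html_bytes to a temp file that stays on disk; return its path.

    Hypothetical stand-in for the PersistentTemporaryFile step in the recipe.
    """
    # delete=False keeps the file after it is closed, so the path stays valid
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as handle:
        handle.write(html_bytes)
    return handle.name


path = save_html_to_temp_file(b'<html><body>print view</body></html>')
with open(path, 'rb') as saved:
    print(saved.read())
os.remove(path)  # clean up; nothing else will do it in this sketch
```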