#1
Member
Posts: 19
Karma: 10
Join Date: Oct 2008
Device: Sony PRS-505
I'm having trouble getting the print versions of articles from the Orlando Sentinel. The problem is that they have completely different article numbers for the regular and print-friendly versions of a feature.
For instance, in this RSS feed: http://feeds.feedburner.com/orlandosentinel

Regular version (the link provided in the RSS feed): http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,2581414.story

Print-friendly version (the link is found on the regular article's page): http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,95752,print.story

The print-friendly link shows up like this in the regular version:

Code:

    <div><img src="/common/images/icons/atools-printer.gif" alt="Print" /><a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" rel="nofollow" >Print</a></div>

I already tried this, but I think it's just looking at the actual RSS feed instead of each article, so it did not help.

Code:

    def print_version(self, url):
        soup = self.index_to_soup(url)
        for item in soup.findAll('a', attrs={'rel':'nofollow'}):
            strhref = item['href']
            match = strhref.find('print.story')
            if match > -1:
                return strhref
        return None
#2
creator of calibre
Posts: 44,207
Karma: 23446406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
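Code:

    import re  # at the top of the recipe file

    def print_version(self, url):
        for a in self.index_to_soup(url).findAll('a', href=re.compile(r'print\.story')):
            # a.string is None when the tag contains nested markup, so check it first
            if a.string and 'Print' in a.string:
                return 'http://www.orlandosentinel.com' + a['href']
        # No print link found: fall back to the regular article URL
        return url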
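
To check the link-matching logic outside calibre, here is a minimal sketch using only the Python standard library. It is not the recipe API: `PrintLinkFinder` and `find_print_link` are hypothetical names made up for illustration, and the sample HTML is the snippet quoted in the first post. It assumes the print link is the first `<a>` whose `href` contains `print.story`, and it uses `urljoin` instead of a hard-coded host to make the relative link absolute.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Same pattern as in the recipe method above
PRINT_HREF = re.compile(r'print\.story')


class PrintLinkFinder(HTMLParser):
    """Hypothetical helper: remembers the first <a> href that matches PRINT_HREF."""

    def __init__(self):
        super().__init__()
        self.print_href = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except the first matching anchor
        if tag != 'a' or self.print_href is not None:
            return
        href = dict(attrs).get('href')
        if href and PRINT_HREF.search(href):
            self.print_href = href


def find_print_link(article_url, html):
    """Return the absolute print URL found in html, or article_url if there is none."""
    finder = PrintLinkFinder()
    finder.feed(html)
    if finder.print_href is None:
        return article_url
    # The site uses a relative href, so resolve it against the article URL
    return urljoin(article_url, finder.print_href)


# Snippet and URL taken from the first post in this thread
sample_html = (
    '<div><img src="/common/images/icons/atools-printer.gif" alt="Print" />'
    '<a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" '
    'rel="nofollow" >Print</a></div>'
)
article_url = (
    'http://www.orlandosentinel.com/business/'
    'orl-existing-home-sales-orlando-100908,0,2581414.story'
)

print(find_print_link(article_url, sample_html))
```

Falling back to the original article URL when no print link is found mirrors the `return url` in the recipe method, so an article without a print version is still downloaded rather than skipped.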
#3
Junior Member
Posts: 3
Karma: 10
Join Date: Mar 2010
Location: Oviedo, FL
Device: Kindle 2
Acey,
Were you able to get your recipe to work with the Orlando Sentinel?
Gatorguy