09-03-2012, 07:27 AM | #1 |
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
|
Recipe without rss feed?
Hi!
There is a webpage (URL) that gets updated daily. There is no RSS feed. Basically, I would like to write a recipe that grabs the contents of that single URL and formats it into an epub for me once a day. My first approach, simply adding the URL to the feeds variable, gives me an epub with all the contents of the URL, including its HTML tags, with no formatting at all. I flipped through the API documentation but only found the 'use_embedded_content' variable, which might be the right direction? However, I feel stuck because the URL's content is interpreted as RSS and not as "news content". Any clue how to process webpages without the help of RSS feeds? Thank you! |
09-04-2012, 09:50 AM | #2 |
Groupie
Posts: 169
Karma: 10
Join Date: May 2012
Device: Kindle Paperwhite2
|
So, you mean you want to download ONLY the starting page (as opposed to using the page as a TOC)?
|
09-04-2012, 09:26 PM | #3 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
You should be able to do this with Instapaper. Just create an account and add that URL to your main feed. The existing recipe should do the rest.
|
09-05-2012, 02:12 AM | #4 |
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
|
Yes, that is the idea.
The single webpage contains all the content (but neither a TOC nor any RSS data). |
09-05-2012, 02:15 AM | #5 |
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
|
Thank you kiklop74,
Using Instapaper works and brings the page to my reader as an epub. However, as I start to learn what calibre recipes look like, I would like to go on and figure out how to process the page myself. That way I might be able to insert special formatting or sectioning, maybe even a table of contents. |
09-05-2012, 02:29 AM | #6 |
Junior Member
Posts: 3
Karma: 497132
Join Date: Sep 2012
Device: ipad
|
|
09-05-2012, 01:07 PM | #7 |
Addict
Posts: 241
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
|
feed43 (feed for free) is pretty good.
|
09-07-2012, 12:38 AM | #8 |
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
|
Idea for a direct method of accessing a static URL?
Okay, thanks - the idea of creating an RSS feed through a service, or even locally on my computer, might work.
However, there must be a way to do it directly with recipes? I have not had much time to dig into this forum, but maybe BeautifulSoup could help? Or is it only used to process a content page after an RSS feed has led to it? |
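The "locally on my computer" idea mentioned above could be sketched roughly as follows: a small script that builds a one-item RSS 2.0 feed pointing at the page. This is only an illustration using the Python standard library; the function name `build_single_item_feed`, the feed title, and the `http://example.com/daily.html` address are placeholders, not anything taken from this thread.

```python
import time
import xml.etree.ElementTree as ET
from email.utils import formatdate


def build_single_item_feed(page_url, feed_title):
    """Return an RSS 2.0 document (as a string) with one item linking to page_url."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = page_url
    ET.SubElement(channel, "description").text = "Single-page feed for " + page_url

    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = feed_title
    ET.SubElement(item, "link").text = page_url
    # A dated guid and a fresh pubDate, so that each day's entry differs
    # from the previous one (an assumption about how the feed is consumed).
    guid = ET.SubElement(item, "guid", isPermaLink="false")
    guid.text = page_url + "#" + time.strftime("%Y-%m-%d")
    ET.SubElement(item, "pubDate").text = formatdate(usegmt=True)

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URL and title; replace with the real page.
    print(build_single_item_feed("http://example.com/daily.html", "Daily Page"))
```

The resulting XML would still have to be saved somewhere a recipe's feeds list can reach, so this is a workaround in the same spirit as feed43 rather than a direct method.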
09-07-2012, 12:45 AM | #9 |
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
|
The main issue I have with Instapaper is that I have to visit the page manually each time I want the current copy. Without this step, every download gives me the same content. So I guess feed43 is the better workaround for my problem. I will give feedback after I get it working.
|
09-07-2012, 06:27 AM | #10 |
Junior Member
Posts: 4
Karma: 10
Join Date: Sep 2012
Device: sony ereader
|
Nicolash, I was trying to do the same thing the other day. You can do what you're asking using a BasicNewsRecipe class and the parse_index function. The following should get you started:
Spoiler:
Basically, all you need to do is inherit from the BasicNewsRecipe class and then initialise a feeds variable with the correct parameters; calibre does the rest. |
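A minimal sketch of the parse_index approach described above (not eroche's original snippet) might look like this. It assumes the documented calibre convention that parse_index returns a list of (feed title, list of article dicts) tuples. The class name `SinglePageRecipe`, the `page_url` attribute, the title, and the `http://example.com/daily.html` address are all placeholders. The try/except fallback exists only so the file can be read and checked outside calibre; inside calibre the real base class is used.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be inspected without calibre installed.
    class BasicNewsRecipe(object):
        pass


class SinglePageRecipe(BasicNewsRecipe):
    title = 'Daily Page'                        # placeholder title
    page_url = 'http://example.com/daily.html'  # placeholder; use the real page

    def parse_index(self):
        # One "feed" containing one "article": the page itself.
        # calibre then downloads and converts the URL like any other article.
        article = {
            'title': self.title,
            'url': self.page_url,
            'date': '',
            'description': '',
            'content': '',
        }
        return [(self.title, [article])]
```

Because parse_index supplies the article list directly, no RSS feed is involved; returning several tuples instead of one would be one way to get multiple sections in the table of contents.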
09-09-2012, 06:35 AM | #11 |
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
|
Thanks, eroche!
Your code snippet really gets me started. Nearly all I had to do to obtain the page as an epub with proper formatting was swap in the URL! I will start playing around with this some more. |
|