08-29-2007, 08:34 PM | #211 |
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini,Kindle Paperwhite
|
|
09-02-2007, 06:42 PM | #212 | |
Junior Member
Posts: 9
Karma: 10
Join Date: Aug 2007
Device: Sony Reader
|
Quote:
Best -ds |
|
09-02-2007, 08:00 PM | #213 |
Resident Curmudgeon
Posts: 76,465
Karma: 136564696
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
McAfee is giving you a false positive. Either update McAfee or find a virus scanner that actually works. Or you could always turn it off, get your RSS feed sorted, and then turn it back on.
|
09-05-2007, 02:14 PM | #214 |
Junior Member
Posts: 5
Karma: 10
Join Date: Aug 2007
Device: sony ereader
|
When I do that the program crashes. Everything I do, the program crashes.
I followed the download instructions, and I'm not a newbie when it comes to computers. I guess I'm out of luck then; oh well, I will stick with book design. When all of the bugs are fixed I will give this program another shot. Last edited by guardianx; 09-05-2007 at 02:22 PM. |
09-06-2007, 06:30 PM | #215 |
Junior Member
Posts: 7
Karma: 10
Join Date: Jun 2007
Device: Sony Reader
|
Anyone else having problems with the subscribe or publish functions? I'm using version 24 and it hangs every time I invoke either.
|
09-06-2007, 07:30 PM | #216 |
Groupie
Posts: 155
Karma: 1044459
Join Date: Jul 2007
Device: prs-500
|
This tool definitely has its merits and I used it for a long time.
Honestly, after feedbooks.com showed up with their newspaper feature and their synchronization tool for my Sony Reader, I can now dock my reader and load up all my RSS feeds from Feedbooks in seconds. I still do appreciate RSS2book for introducing me to properly formatted PDF RSS feeds, and it remains useful for those stubborn websites with limited RSS feeds. F. |
09-07-2007, 07:35 PM | #217 |
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini,Kindle Paperwhite
|
The problem is my DSL speed. Verizon cannot upgrade me as I am on frame relay and there is no ATM or FIOS available in my area. I will look into other solutions for hosting this.
|
09-09-2007, 06:48 AM | #218 |
Junior Member
Posts: 1
Karma: 10
Join Date: Sep 2007
Device: SPH-A580
|
Hi! Great program.
Is it possible to generate SEPARATE PDFs for each story in a feed? I'm trying to create an archive of stories from a particular site, and I'd rather have separate PDFs than one giant one with a month's worth of stories. HTMLDoc doesn't seem to have this feature natively either. Maybe I'd have to recursively run your program for each link? Thanks! |
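One possible workaround for the poster's request, sketched outside the tool (the feed element names are the usual RSS ones, and the per-file HTMLDOC invocation is an assumption, not a tested recipe): split the feed into one HTML file per story yourself, then run HTMLDOC once per file so each story becomes its own PDF.

```python
import subprocess
import xml.etree.ElementTree as ET

def split_feed(feed_xml_path):
    """Write one small HTML file per <item> in the feed; return the paths."""
    tree = ET.parse(feed_xml_path)
    paths = []
    for i, item in enumerate(tree.iter("item")):
        title = item.findtext("title", default="untitled")
        body = item.findtext("description", default="")
        path = "story_%03d.html" % i
        with open(path, "w", encoding="utf-8") as f:
            f.write("<html><head><title>%s</title></head><body>%s</body></html>"
                    % (title, body))
        paths.append(path)
    return paths

def to_pdfs(paths):
    # One HTMLDOC run per story yields one PDF per story.
    for path in paths:
        subprocess.run(["htmldoc", "--webpage", "-f",
                        path.replace(".html", ".pdf"), path])
```

This sidesteps recursion entirely: the splitting happens once, and HTMLDOC never sees more than one story at a time.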
09-11-2007, 06:23 PM | #219 |
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2007
Device: Sony Reader
|
If someone can help me understand how I would pull content from the following website (using the "Web Page" tab of rss2book), it will go a long way toward my understanding not only of how this program works, but also of the regex expressions required to get at the content (and only the content) we are all using this program for:

http://www.timesonline.co.uk/tol/comment/columnists/jeremy_clarkson/

There are a number of links on the page that reference the various blog entries I want to pull, but when I change the rss2book "follow links" setting to depth 2 (or more) I get this error:

Processing clarkson
System.UriFormatException: Invalid URI: The URI scheme is not valid.
   at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
   at System.Uri..ctor(String uriString)
   at web2book.Utils.ExtractContent(String contentExtractor, String contentFormatter, String url, String html, String linkProcessor, Int32 depth, StringBuilder log)
   at web2book.Utils.GetContent(String link, String html, String linkProcessor, String contentExtractor, String contentFormatter, Int32 depth, StringBuilder log)
   at web2book.Utils.GetHtml(String url, Int32 numberOfDays, String linkProcessor, String contentExtractor, String contentFormatter, Int32 depth, StringBuilder log)
   at web2book.WebPage.GetHtml(ISource mySourceGroup, Int32 displayWidth, Int32 displayHeight, Int32 displayDepth, StringBuilder log)
   at web2book.MainForm.AddSource(ContentSourceList sourceClass, ContentSource source, Boolean isAutoUpdate)

If I leave it set at 1, I get:

Processing clarkson
Final content:
===================
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><meta http-equiv="Content-Type" content="text/html;charset=utf-8" /><meta name="ROBOTS" content="NOARCHIVE" /><script type="text/javascript"> // Variables required for DART. MUST BE IN THE HEAD. var time = new Date(); randnum = (time.getTime()); </script><!-- Code to display title of the HTML page --><title> Jeremy Clarkson Columns & Comment | Times Online </title><meta name="Description" content="The UKs favourite motoring journalist comments on British society and culture in his weekly columns on Times Online"><link rel="shortcut icon" type="image/x-icon" href="/tol//img/favicon.ico" type="image/x-icon" /><link rel="stylesheet" type="text/css" href="/tol/css/alternate.css" title="Alternate Style Sheet" /><link rel="stylesheet" type="text/css" href="/tol/css/tol.css"/> <link rel="stylesheet" type="text/css" href="/tol/css/ie.css"/><link rel="stylesheet" type="text/css" href="/tol/css/typography.css"/><script language="javascript" type="text/javascript" src="/tol/js/tol.js"></script></head><body><div id="top"/><div id="shell"><div id="page"><!-- START REVENUE SCIENCE PIXELLING CODE --><script language="javascript" type="text/javascript" src="/tol/js/DM_client.js"></script><script language="javascript" type="text/javascript"> DM_addToLoc("Network",escape("Times")); DM_addToLoc("SiteName",escape("Times Online")); </script><script language="javascript" type="text/javascript"> // Index page for Revenue sciences"

...there's loads more; this is just part of the content.

The point is, I thought that changing the "Follow links to Depth" setting to 2 would grab not only the page referred to in the URL but also follow the links from that URL's page? I would then need to work out what regex would be needed to tidy up the resulting mass of content. (That would be problem / lesson 2, but one thing at a time!) Am I missing something?

(I realise there is an RSS feed page from which I can pull the current top 4 or 5 blog entries, and adinb has helped me clean this up to be readable; what I want to understand is how to manipulate web pages.)

(Thank you again to adinb, who has been helping me with this problem using the RSS feed and the "Feed" tab of rss2book, via PM. It's people like him that keep these types of forums useful. I thought it might be useful for others to understand how it all works, and to lighten the load on adinb!)

Thank you all in advance. |
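For what it's worth, the first step a depth-2 pass has to perform can be sketched outside the tool: collect candidate article links from the index page, then hand each linked page to the content extractor. This is an illustration of the idea only, not the tool's actual code, and the href pattern below is a guess rather than the site's real markup.

```python
import re

# Hypothetical href pattern for illustration only.
LINK_RE = re.compile(r'href="(/tol/comment/columnists/jeremy_clarkson/article[^"]+)"')

def extract_links(index_html, base="http://www.timesonline.co.uk"):
    """Return absolute URLs for every matching link on the index page."""
    return [base + href for href in LINK_RE.findall(index_html)]
```

If the tool fails before this step (as the UriFormatException suggests), no regex tuning on the content side will help; the link list itself has to be valid first.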
09-13-2007, 12:44 AM | #220 |
Books and more books
Posts: 917
Karma: 69499
Join Date: Mar 2006
Location: White Plains, NY, USA
Device: Nook Color, Itouch, Nokia770, Sony 650, Sony 700(dead), Ebk(given)
|
Error "Thu, 13 Sep 2007 06:31:55 EEST is out of range"
Hi,
I tried to use Rss2book to pull down some newspaper feeds. One worked nicely after I figured out a good regex to get just the text, but for the other, whatever I try, I get the following message repeated as many times as there are feed items, and of course with the appropriate time/date (I am at US Eastern +7 hrs, so when I tried at 11:31 pm US Eastern I got exactly the following; when I tried seven minutes later I got the message with 06:38):

Processing Evenimentul
Thu, 13 Sep 2007 06:31:55 EEST is out of range
Thu, 13 Sep 2007 06:31:55 EEST is out of range
....

Is there anything I can do about it? The feed is not in English, but the same was true for the other newspaper, which works just fine: http://www.evz.ro/rss.php/evz.xml |
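One plausible cause (an assumption, since the tool's parsing code isn't shown): RFC 822 date parsers typically recognize only a small set of zone names (GMT, EST, EDT, and so on), and "EEST" (Eastern European Summer Time, UTC+3) is usually not among them. Mapping the unknown abbreviation to its numeric offset before parsing is one possible workaround, sketched here:

```python
from email.utils import parsedate_tz, mktime_tz

# Map zone abbreviations the parser does not know to numeric offsets.
TZ_FIXES = {"EEST": "+0300", "EET": "+0200"}  # extend as needed

def parse_feed_date(s):
    """Parse an RFC 822 feed date, tolerating a few extra zone names."""
    for name, offset in TZ_FIXES.items():
        s = s.replace(name, offset)
    parts = parsedate_tz(s)
    if parts is None:
        raise ValueError("unparseable date: %r" % s)
    return mktime_tz(parts)  # seconds since the epoch (UTC)
```

The same trick could be applied as a regex replacement on the raw feed before it reaches the tool, if the tool offers a pre-processing hook.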
09-13-2007, 04:07 PM | #221 |
Junior Member
Posts: 7
Karma: 10
Join Date: Jun 2007
Device: Sony Reader
|
Okay, I must be losing my mind.
I've been able to extract The New Yorker with the following setup:

URL: http://feeds.newyorker.com/services/...everything.xml
Link Element: Link
Apply extractor to linked content: checked
Link Reformatter: {0}?printable=true
Content Extraction pattern: <!-- start article rail -->(.*) <!-- end article body -->

Then I changed computers, installed the latest .NET updates, downloaded Web2Book, and duplicated the settings, and now it's not working. I only get the article headings; it doesn't seem to be following the link. Any ideas? What's changed?

thanks,
Andy |
09-19-2007, 12:01 AM | #222 |
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2007
Location: Hesperia, CA
Device: Sony Reader PRS500 / iPhone 3GS / iPad 32GB
|
Converting webpages (located on my computer) to PDF
I just bought a Sony Reader last week. It is great.
Here is my problem: I have about 3,000 web pages on my local computer, each one a conversion of a single book. I have tried to convert them to PDF by opening them in Internet Explorer and using the local address as the URL in Web2book. Web2book gives me the following message:

System.UriFormatException: Invalid URI: A port was expected because of there is a colon (':') present but the port could not be parsed.
   at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
   at System.Uri..ctor(String uriString)
   at web2book.Utils.GetUrlResponse(String url, String& error, String postData, ICredentials creds, String contentType)
   at web2book.Utils.GetWebResponse(String url, String& error, String postData, ICredentials creds, String contentType)
   at web2book.Utils.GetContent(String link, String html, String linkProcessor, String contentExtractor, String contentFormatter, Int32 depth, StringBuilder log)
   at web2book.Utils.GetHtml(String url, Int32 numberOfDays, String linkProcessor, String contentExtractor, String contentFormatter, Int32 depth, StringBuilder log)
   at web2book.WebPage.GetHtml(ISource mySourceGroup, Int32 displayWidth, Int32 displayHeight, Int32 displayDepth, StringBuilder log)
   at web2book.MainForm.AddSource(ContentSourceList sourceClass, ContentSource source, Boolean isAutoUpdate)

I can get around this by posting each web page on my Geocities site, but that is a lot of extra work. Any idea how I can convert the local HTML files without doing all that? Thanks!! |
09-23-2007, 03:39 PM | #223 |
Junior Member
Posts: 9
Karma: 10
Join Date: Aug 2007
Device: Sony Reader
|
Help with regular expression
I'm trying to create a Web2Book feed for
http://www.spiegel.de/schlagzeilen/rss/0,5291,,00.xml

I would like to rewrite the links to point to the printable version, but the pattern to replace the link is somewhat complex. A link in the feed looks like this:

http://www.spiegel.de/politik/auslan...506744,00.html

The printable version looks like this:

http://www.spiegel.de/politik/auslan...506744,00.html

From what I can see by examining other links, the constants are:
- http://www.spiegel.de/ (obviously)
- one or more folder names
- the actual file name consists of three numbers separated by commas
- in the printable version, the string "druck-" is added before the third number
- the extension is .html

I'm not so good with regex; help would be appreciated. |
09-24-2007, 06:00 AM | #224 | |
RSS & Gadget Addict!
Posts: 82
Karma: 67
Join Date: May 2005
Location: Albuquerque, NM
Device: Sony PRS-500, iPod Touch, iPhone
|
Quote:
then in the link constructor you could use {1}druck-{2}

I'm all ears for a more efficient regex.

-adin |
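Since adinb's quoted regex did not survive the thread, here is one illustrative way the {1}/{2} groups could be built, as a standalone sketch. It assumes the links end in ",00.html" as the truncated examples suggest, and the full example URL below is constructed from the described pattern, not taken from the actual feed.

```python
import re

# Group 1: everything up to and including the comma before the article
# number; group 2: the article number plus the ",00.html" tail.
LINK_RE = re.compile(r"^(http://www\.spiegel\.de/.*,)(\d+,00\.html)$")

def to_printable(url):
    """Insert 'druck-' before the article number, per the post's rule."""
    return LINK_RE.sub(r"\1druck-\2", url)
```

The greedy ".*," is what pins group 1 to the last possible comma before the article number, so the folder names and leading numbers can vary freely.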
|
09-24-2007, 09:10 PM | #225 |
Junior Member
Posts: 9
Karma: 10
Join Date: Aug 2007
Device: Sony Reader
|
|
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
rss2book release 20 now available | geekraver | Sony Reader | 4 | 01-26-2007 02:36 PM |
rss2book release 19 | geekraver | Sony Reader | 2 | 12-30-2006 11:51 AM |
rss2book release 18 | geekraver | Sony Reader | 0 | 12-22-2006 04:57 AM |
rss2book release 16 | geekraver | Sony Reader | 1 | 12-13-2006 06:56 AM |
rss2book release 13 | geekraver | Sony Reader | 0 | 11-13-2006 03:41 AM |