Old 08-29-2007, 08:34 PM   #211
geekraver
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini,Kindle Paperwhite
Quote:
Originally Posted by guardianx View Post
What do I do with this info? Sorry, I'm new.
You use the Subscribe option on the File menu.
Old 09-02-2007, 06:42 PM   #212
dietric
Junior Member
Posts: 9
Karma: 10
Join Date: Aug 2007
Device: Sony Reader
Quote:
Originally Posted by dietric View Post
I don't want to alarm anyone unnecessarily, but McAfee VirusScan reports that the temporary files created through the RSS conversion process are infected with the Exploit-ObscureHtml trojan. This might well be VirusScan being overzealous about the HTML content, but you should know nevertheless (since it also prevents the program from working correctly).
Would the developer be inclined to look into this problem? I would really love to use this software, but the problems mentioned prevent me from doing so.

Best
-ds
Old 09-02-2007, 08:00 PM   #213
JSWolf
Resident Curmudgeon
Posts: 76,465
Karma: 136564696
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by dietric View Post
Would the developer be inclined to look into this problem? I would really love to use this software, but the problems mentioned prevent me from doing so.

Best
-ds
McAfee is giving you a false positive. Either update McAfee or find a virus scanner that actually works. Or you could always turn it off, get your RSS feed sorted, and then turn it back on.
Old 09-05-2007, 02:14 PM   #214
guardianx
Junior Member
Posts: 5
Karma: 10
Join Date: Aug 2007
Device: sony ereader
Quote:
Originally Posted by geekraver View Post
You use the Subscribe option on the File menu.
When I do that, the program crashes. Everything I do, the program crashes, wtf.
I followed the d/l instructions. I'm not that much of a newbie when it comes to computers. I guess I'm out of luck then. Oh well, I will stick with book design. When all of the bugs are fixed I will give this program another shot.

Last edited by guardianx; 09-05-2007 at 02:22 PM.
Old 09-06-2007, 06:30 PM   #215
squeezebag
Junior Member
Posts: 7
Karma: 10
Join Date: Jun 2007
Device: Sony Reader
Anyone else having problems with the subscribe or publish functions? I'm using version 24 and it hangs every time I invoke either.
Old 09-06-2007, 07:30 PM   #216
flamaest
Groupie
Posts: 155
Karma: 1044459
Join Date: Jul 2007
Device: prs-500
This tool definitely has its merits and I used it for a long time.

Honestly, after feedbooks.com showed up with their newspaper feature and their synchronization tool for my Sony Reader, I can now dock my reader and load up all my RSS feeds from feedbooks in seconds.

I still do appreciate RSS2book for introducing me to properly formatted PDF RSS feeds and for those stubborn websites with limited RSS feeds.

F.
Old 09-07-2007, 07:35 PM   #217
geekraver
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini,Kindle Paperwhite
Quote:
Originally Posted by squeezebag View Post
Anyone else having problems with the subscribe or publish functions? I'm using version 24 and it hangs every time I invoke either.
The problem is my DSL speed. Verizon cannot upgrade me as I am on frame relay and there is no ATM or FIOS available in my area. I will look into other solutions for hosting this.
Old 09-09-2007, 06:48 AM   #218
angrytrousers
Junior Member
Posts: 1
Karma: 10
Join Date: Sep 2007
Device: SPH-A580
Hi! Great program.

Is it possible to generate SEPARATE pdfs for each story in a feed?
I'm trying to create an archive of stories from a particular site, and I'd rather have separate pdfs than one giant one with a month's worth of stories.

HTMLDoc doesn't seem to natively have this feature either. Maybe I'd have to recursively run your program for each link?

Thanks!
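
The "run it once per link" idea could be scripted roughly as follows. This is only a sketch under assumptions: it reads a plain RSS 2.0 document and builds one HTMLDOC command line per item. The helper name `per_item_commands` and the file-naming rule are made up for illustration, and the `--webpage` and `-f` options should be checked against the installed HTMLDOC version before relying on them.

```python
import re
import xml.etree.ElementTree as ET


def per_item_commands(rss_xml):
    """Yield one htmldoc command (as an argument list) per RSS item."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        title = (item.findtext("title") or "untitled").strip()
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # nothing to fetch for this item
        # Hypothetical naming rule: squash anything non-alphanumeric to "_".
        name = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_") or "untitled"
        yield ["htmldoc", "--webpage", "-f", name + ".pdf", link]


if __name__ == "__main__":
    sample = (
        "<rss><channel>"
        "<item><title>First Story!</title><link>http://example.com/1</link></item>"
        "<item><title>Second</title><link>http://example.com/2</link></item>"
        "</channel></rss>"
    )
    for command in per_item_commands(sample):
        print(" ".join(command))
```

Each printed line could then be passed to `subprocess.run` so that every story ends up in its own PDF.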
Old 09-11-2007, 06:23 PM   #219
toomanybarts
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2007
Device: Sony Reader
If someone can help me understand how I would pull content from the following website (using the "Web Page" tab of rss2book), it will go a long way toward me understanding not only how this program works, but also the regex expressions required to get at the content (and only the content) we are all using this program for:
"http://www.timesonline.co.uk/tol/comment/columnists/jeremy_clarkson/"

There are a number of links on the page that reference the various blog entries I want to pull, but when I change rss2book settings for "followlinks" to depth 2 (or more) I get this error
"Processing clarkson

System.UriFormatException: Invalid URI: The URI scheme is not valid.
at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
at System.Uri..ctor(String uriString)
at web2book.Utils.ExtractContent(String contentExtractor, String contentFormatter, String url, String html, String linkProcessor, Int32 depth, StringBuilder log)
at web2book.Utils.GetContent(String link, String html, String linkProcessor, String contentExtractor, String contentFormatter, Int32 depth, StringBuilder log)
at web2book.Utils.GetHtml(String url, Int32 numberOfDays, String linkProcessor, String contentExtractor, String contentFormatter, Int32 depth, StringBuilder log)
at web2book.WebPage.GetHtml(ISource mySourceGroup, Int32 displayWidth, Int32 displayHeight, Int32 displayDepth, StringBuilder log)
at web2book.MainForm.AddSource(ContentSourceList sourceClass, ContentSource source, Boolean isAutoUpdate)"

If I leave it set at 1 I get:
"Processing clarkson

Final content:
===================

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><meta http-equiv="Content-Type" content="text/html;charset=utf-8" /><meta name="ROBOTS" content="NOARCHIVE" /><script type="text/javascript">
// Variables required for DART. MUST BE IN THE HEAD.
var time = new Date();
randnum = (time.getTime());
</script><!-- Code to display title of the HTML page --><title> Jeremy Clarkson Columns & Comment | Times Online </title><meta name="Description" content="The UKs favourite motoring journalist comments on British society and culture in his weekly columns on Times Online"><link rel="shortcut icon" type="image/x-icon" href="/tol//img/favicon.ico" type="image/x-icon" /><link rel="stylesheet" type="text/css" href="/tol/css/alternate.css" title="Alternate Style Sheet" /><link rel="stylesheet" type="text/css" href="/tol/css/tol.css"/>
<link rel="stylesheet" type="text/css" href="/tol/css/ie.css"/><link rel="stylesheet" type="text/css" href="/tol/css/typography.css"/><script language="javascript" type="text/javascript" src="/tol/js/tol.js"></script></head><body><div id="top"/><div id="shell"><div id="page"><!-- START REVENUE SCIENCE PIXELLING CODE --><script language="javascript" type="text/javascript" src="/tol/js/DM_client.js"></script><script language="javascript" type="text/javascript">
DM_addToLoc("Network",escape("Times"));
DM_addToLoc("SiteName",escape("Times Online"));
</script><script language="javascript" type="text/javascript">
// Index page for Revenue sciences"


....there's loads more, this is just part of the content. The point is, I thought that changing the "Follow links to Depth" setting to 2 would grab not only the page referred to in the URL, but also follow the links from that URL's page?
I would then need to work on what REGEX would be needed to tidy up the resulting mass of content. (That would be problem / lesson 2, but one thing at a time!)

Am I missing something?
(I realise there is an RSS feed page where I can pull the current top 4 or 5 blog entries, and adinb has helped me clean this up to be readable. What I want to understand is how to manipulate web pages.)

(Thank-you again to adinb who has been helping me with this problem using the rss feed and the "Feed" tab of rss2book, via PM, it's people like him that keep these types of forums useful...I thought it may be useful for others to understand how it all works and to lighten the load on adinb!)

Thank-you all in advance.
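
One possible explanation for the "URI scheme is not valid" error, which nobody in this thread has confirmed, is that the links on the page are relative (the quoted HTML contains paths such as `/tol/css/tol.css`) and are handed to the URI parser without being joined to the page address first. The sketch below shows what that resolution step looks like in Python; `resolve_link` is a made-up helper name and says nothing about how rss2book is implemented.

```python
from urllib.parse import urljoin, urlparse


def resolve_link(page_url, href):
    """Return an absolute http(s) URL for a link found on page_url."""
    absolute = urljoin(page_url, href)
    # Links such as "javascript:..." or "mailto:..." cannot be fetched.
    if urlparse(absolute).scheme not in ("http", "https"):
        raise ValueError("unsupported link: %r" % href)
    return absolute


if __name__ == "__main__":
    page = "http://www.timesonline.co.uk/tol/comment/columnists/jeremy_clarkson/"
    print(resolve_link(page, "/tol/css/tol.css"))
```

If the followed links are indeed relative, a link processor that prepends the site address might be worth trying.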
Old 09-13-2007, 12:44 AM   #220
Liviu_5
Books and more books
Posts: 917
Karma: 69499
Join Date: Mar 2006
Location: White Plains, NY, USA
Device: Nook Color, Itouch, Nokia770, Sony 650, Sony 700(dead), Ebk(given)
Error "Thu, 13 Sep 2007 06:31:55 EEST is out of range"

Hi,

I tried to use Rss2book to pull down some newspaper feeds. One worked nicely after I figured out a good regex to get just the text, but for the other, whatever I try, I get the following message repeated as many times as the number of feeds, always with the time and date of the attempt (US Eastern + 7 hours: when I tried at 11:31 PM US Eastern I got exactly the following, and seven minutes later I got the same message with 06:38...):

Processing Evenimentul
Thu, 13 Sep 2007 06:31:55 EEST is out of range
Thu, 13 Sep 2007 06:31:55 EEST is out of range
....

Is there anything I can do about it?

The feed link is not in English, but the same was true for the other newspaper that works just fine:

http://www.evz.ro/rss.php/evz.xml
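
A guess at the cause, not confirmed in this thread: the feed dates end in the zone abbreviation "EEST" (Eastern European Summer Time, UTC+3), which many RFC 822 date parsers do not recognise. The Python sketch below shows one way to cope with such dates by swapping the abbreviation for a numeric offset before parsing. The `EXTRA_ZONES` table and the `parse_feed_date` helper are made up for illustration and are unrelated to how rss2book parses dates.

```python
from email.utils import parsedate_to_datetime

# Hypothetical table of abbreviations the standard parser does not know.
EXTRA_ZONES = {"EET": "+0200", "EEST": "+0300"}


def parse_feed_date(text):
    """Parse an RSS pubDate, tolerating a few extra zone abbreviations."""
    head, _, zone = text.strip().rpartition(" ")
    if zone in EXTRA_ZONES:
        text = head + " " + EXTRA_ZONES[zone]
    return parsedate_to_datetime(text)


if __name__ == "__main__":
    print(parse_feed_date("Thu, 13 Sep 2007 06:31:55 EEST").isoformat())
```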
Old 09-13-2007, 04:07 PM   #221
squeezebag
Junior Member
Posts: 7
Karma: 10
Join Date: Jun 2007
Device: Sony Reader
Okay. I must be losing my mind.

I've been able to extract The New Yorker with the following setup:

URL: http://feeds.newyorker.com/services/...everything.xml
Link Element: Link
Apply extractor to linked content is checked
Link Reformatter: {0}?printable=true
Content Extraction pattern: <!-- start article rail -->(.*) <!-- end article body -->

Then I changed computers, installed the latest .NET updates, downloaded Web2Book, and duplicated the settings, and it's not working. I only get the article headings; it doesn't seem to be following the link.

Any ideas? What's changed?

thanks,
Andy
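
For anyone checking settings like these outside the program, here is a rough Python imitation of the two steps: the link reformatter fills `{0}` with the article link, and the extraction pattern keeps whatever sits between the two comment markers. It assumes the pattern is applied with "dot matches newline" semantics, which is a guess about Web2Book's behaviour, and the sample HTML is invented.

```python
import re

LINK_REFORMATTER = "{0}?printable=true"
EXTRACTOR = r"<!-- start article rail -->(.*) <!-- end article body -->"


def reformat_link(link):
    """Fill the {0} placeholder with the original article link."""
    return LINK_REFORMATTER.format(link)


def extract_content(html):
    """Return the text captured by the extractor, or None if no match."""
    match = re.search(EXTRACTOR, html, re.DOTALL)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = ("<html><!-- start article rail --><p>Story text</p>"
              " <!-- end article body --></html>")
    print(reformat_link("http://example.com/article"))
    print(extract_content(sample))
```

A `None` result on a freshly downloaded printable page would suggest that the page markers changed rather than the program.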
Old 09-19-2007, 12:01 AM   #222
rkellmer
Junior Member
Posts: 6
Karma: 10
Join Date: Sep 2007
Location: Hesperia, CA
Device: Sony Reader PRS500 / iPhone 3GS / iPad 32GB
Converting webpages (located on my computer) to PDF

I just bought a Sony Reader last week. It is great.

Here is my problem: I have about 3,000 webpages that are on my local computer. Each one is a conversion of a single book. I have tried to convert them to PDF by opening them in Internet Explorer, and using the local address as the URL in Web2book. Web2book gives me the following message:
--------------------------------------------------------------------------
System.UriFormatException: Invalid URI: A port was expected because of there is a colon (':') present but the port could not be parsed.
at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
at System.Uri..ctor(String uriString)
at web2book.Utils.GetUrlResponse(String url, String& error, String postData, ICredentials creds, String contentType)
at web2book.Utils.GetWebResponse(String url, String& error, String postData, ICredentials creds, String contentType)
at web2book.Utils.GetContent(String link, String html, String linkProcessor, String contentExtractor, String contentFormatter, Int32 depth, StringBuilder log)
at web2book.Utils.GetHtml(String url, Int32 numberOfDays, String linkProcessor, String contentExtractor, String contentFormatter, Int32 depth, StringBuilder log)
at web2book.WebPage.GetHtml(ISource mySourceGroup, Int32 displayWidth, Int32 displayHeight, Int32 displayDepth, StringBuilder log)
at web2book.MainForm.AddSource(ContentSourceList sourceClass, ContentSource source, Boolean isAutoUpdate)
--------------------------------------------------------------------------
I can get around this by posting each webpage on my Geocities site, but that is a lot of extra work. Any idea how I can convert the local html file without doing all that?

Thanks!!
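
The error text suggests that a Windows path such as `C:\...` is being read as a URL, with the colon after the drive letter taken as the start of a port number. Below is a small Python sketch that turns a local Windows path into a `file:///` URI; `local_path_to_uri` and the example path are made up, and whether Web2Book accepts `file:` URIs at all is an open question in this thread.

```python
from pathlib import PureWindowsPath


def local_path_to_uri(path):
    """Convert an absolute Windows path into a file:/// URI."""
    return PureWindowsPath(path).as_uri()


if __name__ == "__main__":
    # Hypothetical location of one of the saved books.
    print(local_path_to_uri(r"C:\Books\book.html"))
```

If `file:` URIs turn out not to be supported, serving the folder from a small local web server and using the resulting `http://localhost/...` addresses would avoid uploading anything to a public site.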
Old 09-23-2007, 03:39 PM   #223
dietric
Junior Member
Posts: 9
Karma: 10
Join Date: Aug 2007
Device: Sony Reader
Help with regular expression

I'm trying to create a Web2Book feed for
http://www.spiegel.de/schlagzeilen/rss/0,5291,,00.xml

I would like to rewrite the links to link to the printable version, but the pattern to replace the link is somewhat complex:
The link in the feed looks like this:
http://www.spiegel.de/politik/auslan...506744,00.html
The printable version like this:
http://www.spiegel.de/politik/auslan...506744,00.html

From what I can see by examining other links the constants are:
- http://www.spiegel.de/ (obviously)
- one or more folder names
- the actual file name consists of three numbers separated by comma
- in the printable version, the string "druck-" is added before the third number
- the extension is .html

I'm not so good with RegEx, help would be appreciated.
Old 09-24-2007, 06:00 AM   #224
adinb
RSS & Gadget Addict!
Posts: 82
Karma: 67
Join Date: May 2005
Location: Albuquerque, NM
Device: Sony PRS-500, iPod Touch, iPhone
Quote:
Originally Posted by dietric View Post
I'm trying to create a Web2Book feed for
http://www.spiegel.de/schlagzeilen/rss/0,5291,,00.xml

I would like to rewrite the links to link to the printable version, but the pattern to replace the link is somewhat complex:
The link in the feed looks like this:
http://www.spiegel.de/politik/auslan...506744,00.html
The printable version like this:
http://www.spiegel.de/politik/auslan...506744,00.html

From what I can see by examining other links the constants are:
- http://www.spiegel.de/ (obviously)
- one or more folder names
- the actual file name consists of three numbers separated by comma
- in the printable version, the string "druck-" is added before the third number
- the extension is .html

I'm not so good with RegEx, help would be appreciated.
how about (http://www.spiegel.de.*/\d,\d{4},)(\d+,\d\d\.html)
then in the link constructor you could use {1}druck-{2}

I'm all ears for a more efficient regex.

-adin
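
For reference, the same rewrite can be tried outside the program with a few lines of Python. This is a sketch only: the `{1}` and `{2}` placeholders are assumed to map to the first and second capture groups, the dots in the host name are escaped here (the original pattern leaves them unescaped), and the sample URL is invented rather than taken from the feed.

```python
import re

PATTERN = re.compile(r"(http://www\.spiegel\.de.*/\d,\d{4},)(\d+,\d\d\.html)")


def printable_link(url):
    """Insert "druck-" before the third number of the file name."""
    match = PATTERN.match(url)
    if match is None:
        return url  # leave links that do not fit the pattern untouched
    return match.group(1) + "druck-" + match.group(2)


if __name__ == "__main__":
    # Hypothetical article link with the same shape as the feed links.
    print(printable_link("http://www.spiegel.de/some/folder/0,1234,567890,00.html"))
```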
Old 09-24-2007, 09:10 PM   #225
dietric
Junior Member
Posts: 9
Karma: 10
Join Date: Aug 2007
Device: Sony Reader
Quote:
Originally Posted by adinb View Post
how about (http://www.spiegel.de.*/\d,\d{4},)(\d+,\d\d\.html)
then in the link constructor you could use {1}druck-{2}

I'm all ears for a more efficient regex.

-adin
That worked out great, thanks. I have tested and published the feed.
All times are GMT -4.