07-13-2007, 08:44 PM | #1 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
web2lrf
Building on my work with web2disk, here's web2lrf (part of libprs500 v0.3.70).
It directly converts websites into LRF files. More than that, it has support for profiles that allow it to preprocess websites to generate better-looking LRF files. Right now it knows about the New York Times, the BBC, The Economist and Newsweek (see attached demos).
To use it with a profile:
Code:
web2lrf profilename
Code:
web2lrf newsweek
Code:
web2lrf --username myusername --password mypassword nytimes
See https://libprs500.kovidgoyal.net/wiki/UserProfiles for instructions and examples.
To use it with an arbitrary website (it won't do any preprocessing):
Code:
web2lrf --url http://mywebsite.com default
Last edited by kovidgoyal; 11-28-2007 at 02:23 AM. |
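For anyone who wants to script these invocations, here is a minimal Python sketch that only assembles the three command-line forms shown in the first post. The helper names `build_web2lrf_command` and `run_web2lrf` are hypothetical and not part of libprs500. The only flags used are `--username`, `--password` and `--url`, exactly as they appear in the post, and the sketch assumes `web2lrf` is installed and on your PATH.

```python
import subprocess


def build_web2lrf_command(profile="default", username=None, password=None, url=None):
    """Assemble a web2lrf argument list.

    Hypothetical helper, not part of libprs500. It uses only the flags
    shown in the first post: --username, --password and --url.
    """
    # The post only shows the two credentials used together.
    if (username is None) != (password is None):
        raise ValueError("username and password must be given together")

    cmd = ["web2lrf"]
    if username is not None:
        cmd += ["--username", username, "--password", password]
    if url is not None:
        cmd += ["--url", url]
    # The profile name comes last, as in the examples in the post.
    cmd.append(profile)
    return cmd


def run_web2lrf(**kwargs):
    """Run web2lrf; assumes the executable is installed and on PATH."""
    return subprocess.run(build_web2lrf_command(**kwargs), check=True)


if __name__ == "__main__":
    # Print the argument lists instead of running anything.
    print(build_web2lrf_command(profile="newsweek"))
    print(build_web2lrf_command(profile="nytimes",
                                username="myusername",
                                password="mypassword"))
    print(build_web2lrf_command(url="http://mywebsite.com"))
```

Passing the arguments as a list rather than one shell string avoids quoting problems if a password or URL contains spaces or shell metacharacters.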
07-14-2007, 04:52 PM | #2 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Released v0.3.62 with a Newsweek profile. See the attached demo in the first post. Since I actually read Newsweek, I've taken a little more care over this profile; it has a nice hierarchical TOC.
Ironic, since for some reason I haven't been getting my Newsweeks for the past month ;-) |
07-14-2007, 06:17 PM | #3 |
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
|
I quickly uninstalled the prior version and installed the new one. Tried it:
"TypeError: option_parser() takes no arguments (1 given)"
Even the nytimes demo gave the same result. |
07-14-2007, 06:26 PM | #4 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Oops, typo... released 0.3.73 with the fix. It'll take 20 minutes to reach the servers.
|
07-15-2007, 02:03 AM | #5 |
Addict
Posts: 274
Karma: 332
Join Date: Nov 2003
Location: San Francisco, USA
Device: Sage, Elipsa, Oasis, Galaxy Tab 8U, S22U
|
Kovidgoyal, thanks a lot for all your work.
NYTimes and BBC work fine for me, but Newsweek gives an error message. I'm looking forward to more profiles :-)
C:\Temp>web2lrf newsweek
Fetching feeds... done
Downloading .WARNING: Could not fetch link file://c:\docume~1\davidd~1\locals~1\temp\libprs500oezp07\index.html
file://c:\docume~1\davidd~1\locals~1\temp\libprs500oezp07\index.html saved to
Traceback (most recent call last):
  File "convert_from.py", line 124, in <module>
  File "convert_from.py", line 116, in main
  File "convert_from.py", line 74, in create_lrf
  File "libprs500\ebooks\lrf\html\convert_from.pyo", line 1233, in process_file
  File "libprs500\ebooks\lrf\html\convert_from.pyo", line 1431, in get_path
  File "libprs500\__init__.pyo", line 74, in extract
Exception: Unknown archive type |
07-15-2007, 03:48 AM | #6 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Windows strikes again! Fixed in 0.3.74.
As for new profiles, nothing is planned, as my needs are met. But feel free to contribute :-) |
07-15-2007, 09:33 AM | #7 |
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2007
Device: Sony PRS-500
|
You, my friend, are a god. I just ordered my PRS-500 (not here yet, but I can't wait) and was preemptively searching for ways to do this. I realize I'm jumping the gun here, but thanks so much.
Also, great to see it offered for Linux. |
07-15-2007, 09:59 AM | #8 |
Resident Curmudgeon
Posts: 76,395
Karma: 136466962
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
|
07-15-2007, 12:50 PM | #9 |
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
|
Still coming up 0.3.73 for me.
|
07-15-2007, 02:36 PM | #10 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Sorry, uploading now.
|
07-15-2007, 02:39 PM | #11 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It was developed for Linux; Windows and OS X support came later. Indeed, it would not have been possible without all the other great free software that's been developed for Linux.
|
07-16-2007, 05:17 PM | #12 |
Addict
Posts: 274
Karma: 332
Join Date: Nov 2003
Location: San Francisco, USA
Device: Sage, Elipsa, Oasis, Galaxy Tab 8U, S22U
|
Kovidgoyal, Newsweek now works on Windows, but the output is a small RSS-type file. The articles are not pulled.
I have a few basic questions; sorry, I couldn't find the answers in the many pages of stickies.
How can I make the font smaller? Do I need to load more (or different) fonts onto the reader, or is it a setting somewhere in web2lrf?
Where can I find the profiles for NYTimes (or BBC or Newsweek)? I'd like to use one as a template for other sites.
Thanks in advance,
David (not a power user) |
07-16-2007, 08:33 PM | #13 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You can see the profiles here: https://libprs500.kovidgoyal.net/bro...web?order=name
The Newsweek error will be fixed in the next release. |
07-20-2007, 03:06 PM | #14 | |
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2007
Device: Sony PRS-500
|
Quote:
Nonetheless, once that was done, I tried out the NYTimes script and it worked like a charm. Thank you so much for the hard work. |
|
07-20-2007, 05:53 PM | #15 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You only need convertlit if you plan on converting lit files. The rest of libprs500 will work just fine without it.
|
Tags |
libprs500, web2lrf |
|