03-20-2009, 09:49 PM | #1 |
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
|
Sony and Google Books: any way to bulk download all free books?
Just wondering: now that Sony and Google offer their old book collection in EPUB format, is there a way to bulk download it all? I'm a huge book collector, so having scans of the originals, as Google does, is 100% better to me. Is there a way, torrent or otherwise, to get the free collection other than clicking one by one? I'll grow into an old man before I'm done otherwise!
Any tips are very, very welcome! |
03-20-2009, 10:15 PM | #2 |
creator of calibre
Posts: 44,417
Karma: 23977332
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You could probably use some sort of web scraper to do it.
|
03-20-2009, 10:23 PM | #3 |
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
|
You know, that is a great idea! I bet PageNest would do it! I was getting them from the Sony software, but I bet there is a web page where they can be found. Thanks for the tip, I'm on the trail!
|
03-20-2009, 10:42 PM | #4 |
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
|
Hmm, it seems these EPUB books are only downloadable from the Sony software, so a web scraper might not work in this case, unless there is a web page that hosts them... but http://books.google.com/ doesn't seem to offer any download option for the EPUB format...
|
03-21-2009, 01:57 AM | #5 |
creator of calibre
Posts: 44,417
Karma: 23977332
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The Google Books Data API may have calls to access the books, though I'm not sure.
|
03-21-2009, 03:40 PM | #6 |
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
|
It would probably request a password from the site, and that would mean writing an application to authenticate and request the files from the server...
Coming back to the web scraper idea, I tried the Google Book Search page and found that all the books are there in PDF format. They are all scanned, so I'm guessing there must be an easy way to convert from a PDF scan to an EPUB scan, much like PDFRead does for LRF. So I got myself a website scraper and I'm downloading right now... up to 3 gigs and counting! Thanks for the help! PS: The files seem quite large, so this might not be the best idea, but I'll let it go and see how big it gets...
Last edited by Student1; 03-21-2009 at 03:47 PM. |
03-21-2009, 07:55 PM | #7 |
Guru
Posts: 988
Karma: 12653
Join Date: Apr 2008
Device: None of your business
|
There are roughly 1 million public-domain (PD) books on Google. If we average the PDF size at, say, 10 megs, you are talking 10 TB of data. You might be biting off more than you can chew...
-MJ |
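For anyone who wants to rerun that estimate with different guesses, here is a minimal sketch of the arithmetic. The book count and average size are the rough figures from the post, decimal units (1 TB = 1,000,000 MB) are an assumption, and `total_size_tb` is just an illustrative helper name.

```python
# Back-of-the-envelope size estimate using the rough figures from the post.
# Assumption: decimal units, so 1 TB = 1,000,000 MB.
BOOK_COUNT = 1_000_000   # "roughly 1 million" public-domain books
AVG_PDF_MB = 10          # guessed average size of one scanned PDF, in MB
MB_PER_TB = 1_000_000


def total_size_tb(book_count: int, avg_mb: float) -> float:
    """Return the estimated size of the whole collection in terabytes."""
    return book_count * avg_mb / MB_PER_TB


print(f"{total_size_tb(BOOK_COUNT, AVG_PDF_MB):.1f} TB")  # prints "10.0 TB"
```

Halving or doubling the average PDF size scales the total linearly, so the result is only as good as the 10 MB guess.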
03-21-2009, 08:15 PM | #8 |
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
|
I was thinking about that... 5 gigs and counting now! There are a lot of German and Spanish books I'll just delete. We'll see when I start needing space... I left one 360-gig HD just for this... once it gets to 100 gigs I'll have to do something.
I wish the versions available were the EPUB ones... as these PDFs will need some PDFRead love to convert!
Last edited by Student1; 03-21-2009 at 08:19 PM. |
03-22-2009, 05:06 AM | #9 |
Guru
Posts: 988
Karma: 12653
Join Date: Apr 2008
Device: None of your business
|
Hmm, someone is welcome to check my math, but if it took you 5 hours to go from 3 gigs to 5 gigs, that is 1 GB per 2.5 hours; with roughly 10 TB to go, that is about 25,000 hours, so you should be done in nearly 3 years... But I like the idea; if I had the space and the bandwidth (I'm not -that- patient), I'm sick enough to do it too...
I'd rather have the PDFs than the EPUBs for archiving: you can always re-OCR the PDFs or use them as a visual reference for correcting an EPUB, whereas with the EPUB you are stuck with the original OCR... -MJ |
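Taking up the invitation to check the math, here is a small sketch using the post's own figures: 2 GB downloaded in 5 hours and roughly 10 TB in total. Decimal units (1 TB = 1,000 GB) and a constant download rate are assumptions, and `download_days` is a made-up helper name.

```python
# Rough download-time estimate at a constant rate.
# Assumption: decimal units, so 1 TB = 1,000 GB.
GB_PER_TB = 1_000


def download_days(total_tb: float, gb_done: float, hours_elapsed: float) -> float:
    """Days needed to fetch total_tb if gb_done took hours_elapsed hours."""
    hours_per_gb = hours_elapsed / gb_done       # 5 h / 2 GB = 2.5 h per GB
    total_hours = total_tb * GB_PER_TB * hours_per_gb
    return total_hours / 24


days = download_days(total_tb=10, gb_done=5 - 3, hours_elapsed=5)
print(f"{days:.0f} days, about {days / 365:.1f} years")
# prints "1042 days, about 2.9 years"
```

At 2.5 hours per gigabyte, the full 10 TB comes to about 25,000 hours, so the bottleneck is the connection speed rather than disk space.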
03-22-2009, 05:10 AM | #10 | |
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
|
Quote:
I wonder what they used to automate the OCR process for all those books. I can't believe they did them all manually... it wouldn't make sense! |
|
03-22-2009, 09:58 AM | #12 |
Evangelist
Posts: 478
Karma: 451808
Join Date: Feb 2009
Location: California, USA
Device: my two eyes, KLiiK, Sony PRS-700
|
Wow... I can't believe anyone would collect digital books like this. Why are you doing it? It's so curious. From my point of view, it's not the same as collecting physical books, is it? So many of the books on Google Books that I've found under keywords like "ethnography" and "anthropology" are just crappy (excuse me) society transactions and old material that is now outdated in the field.
|
03-22-2009, 10:07 AM | #13 |
Guru
Posts: 988
Karma: 12653
Join Date: Apr 2008
Device: None of your business
|
I have perhaps 50-100 texts from Google that I've grabbed for reference, many of which haven't been reprinted and whose originals are quite rare. But I do like the idea of having anything available... And I've seen things disappear from access way too many times to trust that they'll be there when you need them... Everyone's tastes and needs are different...
-MJ |
03-22-2009, 09:20 PM | #14 |
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
|
Personally, I want to create a very, very good digital library. Like mjh215 said, it's there now, but who knows if it will always be there! These digital copies are almost better than the real thing: they never get old, while books age and eventually need to be replaced.
Old books are like an imprint of time. Sure, views in archaeology and history might have changed, but it's almost a social study to see how society thought at one time. That's why these books are of so much value to me!
Last edited by Student1; 03-22-2009 at 09:29 PM. |