#1 |
Addict
Posts: 264
Karma: 3000000
Join Date: Nov 2015
Device: none
|
Opinions on Archive.org as free ebook source?
I'm finding that I'm relying on them more and more compared to other sources such as Project Gutenberg and Standard Ebooks. They don't go with reflowable text; they just offer scans of the original books in as high a quality as possible. They do offer epubs in some cases, but those are often poorly formatted, with flawed OCR and no proofreading. Then again, with the scans none of that is needed. With epubs, proofreading is often a must, and often a problem: corrupted pages, ink splats, or just OCR that misbehaved can all cause text errors that then get carried across various commercial and free ebooks for years. On a scanned page, such damage is plainly visible, so the reader can see it and often deduce what the corrupted word was.
Then there's another benefit: illustrations. You rarely see them even in commercial ebooks, and when they are available, they're often of low quality. On archive.org, you can choose from various editions, compare which looks best, and go with it. There are size and availability concerns, though. Some books can't be downloaded, and when they can, the files tend to be quite big, massive compared to epubs. Also, the pdfs can be of worse quality than the processed jpegs, so I often download those instead and convert them into pdfs on my own.
Last edited by poohbear_nc; 06-18-2024 at 01:05 PM.
#2 |
Grand Sorcerer
Posts: 12,350
Karma: 234567906
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
|
I only read reflowable ebooks; PDF is out of the question for me. Too clumsy and uncomfortable even on a computer, let alone an ereader. And I couldn't care less about illustrations. Give me pure text any day.
|
#3 |
Grand Sorcerer
Posts: 5,617
Karma: 102020751
Join Date: Apr 2011
Device: pb360
|
I think of the (public domain) scans on archive.org as valuable raw material for ebook producers that can't or don't want to do the scanning themselves. (Just as Project Gutenberg can be a starting point for someone wanting to produce high quality formatting.)
|
#4 |
Reading till the spring
Posts: 12,871
Karma: 96999999
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
|
Archive.org is only good for PDF scans. You need to check that they are really PD and do your own clean-up / OCR / proofing. Their OCR is barely adequate, and only for search.
You need a decent QHD screen, better than basic HD anyway, to read them. Often the quality is poor.
#5 |
Wizard
Posts: 1,470
Karma: 9202958
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Yes, definitely agree.
|
#6 |
Gentleman and scholar
Posts: 11,412
Karma: 110962859
Join Date: Jun 2015
Location: Space City, Texas
Device: Clara BW; Nook ST w/Glowlight, Paperwhite 3
|
I personally wouldn't enjoy reading one of those Archive.org (or OpenLibrary) book scans on my reader. They are a mess.
But they can serve as valuable raw material, and I'm glad they are doing what they are doing (even the scans of in-copyright stuff).
#7 |
Grand Sorcerer
Posts: 6,815
Karma: 26985557
Join Date: Apr 2009
Location: USA
Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3
|
While a lot of it is available for ePub download, PDF is almost always better in the end. Some of the scans are pretty good, for example the ones you have to check out an hour at a time.
I've been sufficiently motivated to clean a few up and re-OCR them, but it is a lot of work, and it is difficult to automate any part of that workflow.
#8 |
Custom User Title
Posts: 9,928
Karma: 68368377
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
The PDFs scanned by the Internet Archive itself use Luratech layering/transparency for compression. I've had issues with them on my Kobo devices.
The OCR'd ePubs I consider a lost cause.
#9 |
want to learn what I want
Posts: 1,481
Karma: 7030383
Join Date: Sep 2020
Device: none
|
Sometimes I see very low "brightness" in Archive.org scans.
A while ago, when looking at one of those, I found an awesome Windows program that can adjust all the images in a PDF file in just one click: https://www.naps2.com/
Last edited by Comfy.n; 06-18-2024 at 07:48 PM.
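For anyone curious what a brightness fix of this kind amounts to, here is a minimal sketch in plain Python. It is not NAPS2's code, and this thread does not say how NAPS2 does it internally. It assumes a page is just a list of 8-bit grayscale values, and the name `stretch_levels` and the sample numbers are made up for illustration. The idea is a simple levels stretch: the darkest value on the page is mapped to 0, the lightest to 255, and everything in between is scaled linearly.

```python
def stretch_levels(pixels, black_point=None, white_point=None):
    """Linearly rescale 8-bit grayscale values.

    black_point maps to 0 and white_point maps to 255. When either is
    omitted, the darkest / lightest value found in `pixels` is used.
    (Hypothetical helper for illustration only.)
    """
    if not pixels:
        return []
    lo = min(pixels) if black_point is None else black_point
    hi = max(pixels) if white_point is None else white_point
    if hi <= lo:
        # Flat page (or bad points): nothing sensible to stretch.
        return list(pixels)
    out = []
    for p in pixels:
        # Integer arithmetic keeps the result deterministic.
        v = (p - lo) * 255 // (hi - lo)
        # Clamp values that fall outside the chosen points.
        out.append(max(0, min(255, v)))
    return out


# A dim, low-contrast "page": nothing darker than 40 or lighter than 140.
dim_page = [40, 60, 90, 120, 140]
print(stretch_levels(dim_page))  # [0, 51, 127, 204, 255]
```

On a real scan you would run something like this over every page image, usually with fixed black and white points so that a stray speck does not decide the whole range.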
#10 | |
Wizard
Posts: 1,470
Karma: 9202958
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
I don't use it for the pdf adjustment, but I do use the batch scanning mode, which works really well.
=edit= Thanks for the reminder. I see I am a bit out of date using v6.1. Best upgrade to the latest version.
|
#11 |
want to learn what I want
Posts: 1,481
Karma: 7030383
Join Date: Sep 2020
Device: none
|
|
#12 |
Grand Sorcerer
Posts: 45,200
Karma: 58221465
Join Date: Jan 2007
Location: Peru
Device: KINDLE: Oasis 3, Scribe (1st), Matcha; KOBO: Libra 2, Libra Colour
|
I just do this in calibre:
VIEW (open the PDF file)
FILE
EXPORT
QUARTZ FILTER (change to BLACK & WHITE)
SAVE
I then take the saved file from Documents (MacBook Air), load it into calibre, and transfer it to my Kindle Scribe. Beautiful. The covers are in black and white, but I don't care.
Then (pay attention
Most of my PDF files come from Archive.org.
WONDERFUL. SIMPLE. ENJOYABLE. PDF FILE. KINDLE SCRIBE. YEAH!
Last edited by Dr. Drib; 06-18-2024 at 09:33 PM.
#13 |
Guru
Posts: 621
Karma: 12345678
Join Date: Jan 2015
Location: Canada
Device: none
|
While they are not my favorite source of public domain books, I have grabbed a number of their PDFs.
I use EBookDroid to read them on my tablet, as it provides a number of options to improve PDF readability.
#14 | |
Addict
Posts: 310
Karma: 2765231
Join Date: Jul 2023
Device: Scribe, OA2, Glo HD, PRS-350
|
Quote:
|
|
#15 |
Grand Sorcerer
Posts: 13,000
Karma: 76440556
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
There seem to be some multi-core discussions at https://sourceforge.net/p/naps2/discussion/general/
|