Old 01-13-2008, 01:04 AM   #16
recycledelectron
Groupie
Posts: 152
Karma: 854
Join Date: Dec 2007
Device: Lifebook T5010
Quote:
Originally Posted by vivaldirules
My apologies! I think your efforts are valiant, recycledelectron,
When I referred to illegitimi, I was referring to the copyright mafiAA. I hate it when someone is told they cannot legally do something.

The only law should be to not deprive anyone of their life, liberty, or property, except in self-defense or in the defense of an innocent person. Ripping a book that is not available as an eBook is NOT wrong, as it does not deprive anyone of life, liberty, or property.

Telling someone to give up is very distasteful, as it discourages innovation. Innovation is what allows me to live in an air-conditioned home, use PCs, and go hunting instead of getting eaten by big predators.

Quote:
Originally Posted by vivaldirules
but I don't think this is an activity for the faint of heart. If a 6.2 Mpixel image is not good enough for a textbook page, my heart flutters to imagine what is.
Actually, the 6.2MP camera works fine when correctly focused, but the auto-focus causes me problems. It will get 2/3 of the page fine, but the print near the edge is a problem when taking in a large page. Therefore, a 6MP DSLR or a 9MP point-and-shoot should work on the worst textbooks.
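
A rough way to see why a 6.2MP capture is borderline for a large textbook page is to estimate how many pixels per inch end up on the paper. The sketch below is only an illustration: it assumes a 3:2 sensor aspect ratio and an 8.5 x 11 inch page that fills the frame with no margin, neither of which comes from the setup described above, and `effective_dpi` is a hypothetical helper name.

```python
import math

def effective_dpi(megapixels, page_w_in, page_h_in, aspect=3 / 2):
    """Estimate pixels per inch on the page for a single full-frame shot.

    Assumes the sensor's long side is aligned with the page's long side
    and that the page fills the frame as far as the aspect ratio allows.
    """
    total_pixels = megapixels * 1_000_000
    long_px = math.sqrt(total_pixels * aspect)   # pixels along the long side
    short_px = long_px / aspect                  # pixels along the short side
    page_long = max(page_w_in, page_h_in)
    page_short = min(page_w_in, page_h_in)
    # The tighter of the two directions limits the usable resolution.
    return min(long_px / page_long, short_px / page_short)

if __name__ == "__main__":
    for mp in (6.2, 9.0):
        print(f"{mp} MP -> about {effective_dpi(mp, 8.5, 11):.0f} dpi")
```

Under these assumptions a 6.2MP frame lands somewhat below the 300 dpi that the scanner users in this thread work at, which fits the observation that small print near the edges is the first thing to suffer.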

Digital cameras are dropping in price so fast that if the camera's price fazes you, wait a semester and they will be cheaper.

Quote:
Originally Posted by vivaldirules
And "paging" through a book by flipping between jpegs that total 1 Gbyte or more has me swooning. I'm glad this works for you but this won't for me - ever. I need a process that is a lot less intense and time-consuming.
I can photograph 500 pages an hour. Then they copy to my PC via USB from a card reader at a rate of several thousand pages an hour. During that time, I can rename them. This is necessary because I snap pics of the odd pages first, and then do the even pages. After I rename them with the page number as the name, they fall in alphabetical order.
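
The renaming step can be sketched in a few lines. This is a minimal illustration, not the exact procedure used above: it assumes the odd pages were shot in ascending order (1, 3, 5, ...) followed by the even pages in ascending order (2, 4, 6, ...), and the function name, the `IMG_` file names, and the four-digit padding are all hypothetical. The zero padding matters because plain alphabetical sorting would otherwise put `10.jpg` before `2.jpg`.

```python
def page_order_names(odd_shots, even_shots, ext=".jpg", width=4):
    """Map original file names to zero-padded page-number names.

    odd_shots:  file names in the order the odd pages were photographed
                (pages 1, 3, 5, ...).
    even_shots: file names in the order the even pages were photographed
                (pages 2, 4, 6, ...).
    """
    mapping = {}
    for i, name in enumerate(odd_shots):
        mapping[name] = f"{2 * i + 1:0{width}d}{ext}"
    for i, name in enumerate(even_shots):
        mapping[name] = f"{2 * i + 2:0{width}d}{ext}"
    return mapping

if __name__ == "__main__":
    odd = ["IMG_0001.JPG", "IMG_0002.JPG"]    # hypothetical camera file names
    even = ["IMG_0003.JPG", "IMG_0004.JPG"]
    for old, new in page_order_names(odd, even).items():
        # Printing only; swapping in os.rename(old, new) would apply the change.
        print(old, "->", new)
```

If the even pages are shot while flipping back through the book from the end, reverse that list before passing it in.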

It takes a day or two over a weekend to digitize all my textbooks for that semester, so count off maybe 2 weekends a year to relieve myself of carrying a dozen textbooks at a time. Instead, I'm the one with the tiny notepad-sized case.

My personal library beats the university library, and fits in the passenger seat of my pickup.

As for the GB size, my PRS-505 changes to the next pic as quickly as it flips between pages in a PDF. The zoom works MUCH better on JPEGs than it does on PDFs. I like JPEGs better than PDFs on the PRS-505.

Quote:
Originally Posted by vivaldirules
I have images of you in a dark basement frantically turning pages.
Good lighting is essential to good book ripping ;o)

Quote:
Originally Posted by vivaldirules
Sorry for having fun with you but I couldn't help it. I certainly hope the image I have is wrong!
You are very wrong. I spend 2 weekends a year digitizing my textbooks, and am the only person on the faculty who does not drag home massive bags of books. I grab my eBook reader and a notepad in a small case, and go with that. I've got everything I need right there.

Last semester, during finals week, a student walked up to me while I was eating lunch on campus and asked if I had graded his paper. I had previously scanned it and the other papers with an ADF, and saved them on an SD card. While he watched, I dropped the right card in my eBook reader, graded the paper, and recorded the grade on a notepad to mark in my online grade book later. He was astounded that I had everything right there in a 1-pound package.

Andy

P.S.

Most people don't read or study.

What would happen if you always had access to every book ever written, and could instantly switch from reading to listening to the audio book at that exact word? (When you get bored, when your eyes get tired, or when you have to drive somewhere, you switch seamlessly.)

Could a bright, self-motivated kid get an education in the world's least competent school?

What if that reader did not depend on any outside technology? (i.e., it was solar powered, and rugged like a tennis shoe.)

Think of the regimes that have burned books. Could a government keep its people ignorant?

What would happen to the self-reliance of individuals when they can bring up a manual on auto repair on the side of the road?

How much better off would a patient be, if they could pull up a beginner's medical text when trying to understand a life-changing diagnosis? I've driven to the hospital, and would have liked to find the passage, then ask the eBook to read it to me.
Old 01-13-2008, 10:42 AM   #17
vivaldirules
When's Doughnut Day?
Posts: 10,059
Karma: 13675475
Join Date: Jul 2007
Location: Houston, TX, US
Device: Sony PRS-505, iPad
Well, recycledelectron, I'm very impressed. My apologies, again! A day or two to do several textbooks might be acceptable even for me. Also, using JPEGs instead of PDFs put me off, but I agree with you that the zooming and panning work fine, and I wish Sony supported that for PDFs. But how do you deal with accessing page 123 and then flipping to page 812? Do you advance ten pages (images) at a time from the menu, or do you use a hack? Also, I assume there's no linkable table of contents. Does that slow you down, or do you have a solution for that, too?
Old 01-19-2008, 08:16 AM   #18
shousa
Groupie
Posts: 181
Karma: 232
Join Date: May 2006
I have a number of books I am going to convert using recycledelectron's method of camera and tripod.

Any suggestions or tips, recycledelectron, over and above what you have written so far? E.g. how close should the camera be? You know, the "finer" points.

Like the question above: can you access page 300 and then go back to 200? (Not that that would be a deal breaker for me, just wondering.)

This seems good?
http://www.wikihow.com/Scan-a-Book-W...Digital-Camera.

Last edited by shousa; 01-19-2008 at 08:48 AM.
Old 01-22-2008, 01:10 PM   #19
jackbrown
Enthusiast
Posts: 36
Karma: 10
Join Date: Apr 2006
Location: San Francisco USA/North Africa
A cheap scanner at 300 dpi (black and white!) and software like ABBYY FineReader is all you need for this. Scanning, OCRing and PDFing a book takes a couple of hours. I do it all the time; you can read something else while you do it.

If you're going to use recycledelectron's method, try to figure out a way to quickly turn the images black and white (not grayscale!) as early in the process as possible, and turn the autofocus off. I used a setup like the one he describes for scanning a rare book and took color pictures (big mistake); I also didn't have good enough lighting for a really high contrast ratio. The resulting images basically sucked and I had a nightmarish time making the ebook. It'd be great if your camera could capture in black and white, but it almost certainly can't, so make sure you white balance it against a blank page in the room you are capturing in, then transform the captured files into black and white before you OCR. Good luck, and like I said, I think a cheapo scanner is more practical, unless you need really large format captures.
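
One way to turn a capture into true black and white (not grayscale) is to pick a single cutoff and push every pixel to either pure black or pure white. The sketch below uses Otsu's method to choose the cutoff; that choice is an assumption made here for illustration, not something described above. To stay self-contained it works on plain lists of gray values from 0 to 255, so loading a real photo would need an image library on top of it, and both function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best separates dark from light.

    pixels: flat iterable of integer gray values in 0..255.
    Otsu's method: choose the cut that maximizes between-class variance.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_var = 0, -1.0
    w_dark = 0      # number of pixels at or below the candidate level
    sum_dark = 0    # sum of gray values at or below the candidate level
    for level in range(256):
        w_dark += hist[level]
        sum_dark += level * hist[level]
        if w_dark == 0:
            continue            # no dark pixels yet, nothing to separate
        w_light = total - w_dark
        if w_light == 0:
            break               # everything is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        between_var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_level, best_var = level, between_var
    return best_level

def to_black_and_white(rows):
    """Convert rows of gray values into rows containing only 0 or 255."""
    cut = otsu_threshold([p for row in rows for p in row])
    return [[0 if p <= cut else 255 for p in row] for row in rows]

if __name__ == "__main__":
    page = [[30, 40, 220], [210, 35, 230]]   # tiny made-up "image"
    for row in to_black_and_white(page):
        print(row)
```

A single global cutoff only works well when the lighting is even across the page, which is another reason the lighting and white-balance advice above matters.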

Last edited by jackbrown; 01-22-2008 at 01:21 PM.
Old 01-22-2008, 02:14 PM   #20
philodox
Member
Posts: 21
Karma: 110
Join Date: Dec 2007
Device: Bookeen CyBook Gen 3
I've got a couple of old books that are nearly falling apart... might be fun to try a scanner with auto feed. Destroying the books wouldn't be a problem at this point. Are there any decent and cheap ones that will take a scan of each side and keep the pages in the right order?

Once I have the images it would be easy enough [though perhaps time-consuming] to reformat them as a PDF and use the built-in OCR in Adobe Acrobat. Are there PDF to mobi converters?

Even though each step may take a long time, if I can get a system working that only requires a small amount of user input between these large steps, it might be worth my while.
Old 01-22-2008, 02:34 PM   #21
yvanleterrible
Reborn Paper User
Posts: 8,616
Karma: 15446734
Join Date: May 2006
Location: Que Nada
Device: iPhone8, iPad Air
Quote:
Originally Posted by philodox
I've got a couple of old books that are nearly falling apart... might be fun to try a scanner with auto feed. Destroying the books wouldn't be a problem at this point. Are there any decent and cheap ones that will take a scan of each side and keep the pages in the right order?

Once I have the images it would be easy enough [though perhaps time-consuming] to reformat them as a PDF and use the built-in OCR in Adobe Acrobat. Are there PDF to mobi converters?

Even though each step may take a long time, if I can get a system working that only requires a small amount of user input between these large steps, it might be worth my while.
Tried that with a book from around the sixties. The paper was so bad that the first page actually got shredded in the scanner, causing a paper jam and necessitating a dismantling of the device to get at the pieces.
The software included with the machine can take care of the order the pages come out in, provided you don't make mistakes in feeding.
Do you have Acrobat Pro? I didn't know it did OCR!?!
Old 01-22-2008, 03:44 PM   #22
aru
Likes to read on e-ink
Posts: 68
Karma: 50
Join Date: Feb 2007
Location: New Jersey
Device: Kindle DX
Don't forget the Plustek OpticBook 3600, which takes 10-20 sec per page and then, if you want, OCRs the page for you. If not, it still gets the orientation for even and odd pages right. It has a big button for the next page on the scanner itself, so you don't have to go back and forth to your computer. It scans paperbacks and bound books without problems due to the binding. You only have to open the book 90 degrees. This makes all the difference. In my opinion, it is better than taking pictures with an SLR.
It takes me about an hour to get a reasonably sized book into my PC.
Old 01-22-2008, 07:01 PM   #23
AnemicOak
Bookaholic
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
Here's an automatic book scanner made with legos...

http://www.geocities.jp/takascience/lego/fabs_en.html
Old 01-22-2008, 07:23 PM   #24
slayda
Retired & reading more!
Posts: 2,764
Karma: 1884247
Join Date: Sep 2006
Location: North Alabama, USA
Device: Kindle 1, iPad Air 2, iPhone 6S+, Kobo Aura One
Quote:
Originally Posted by philodox
Are there any decent and cheap ones that will take a scan of each side and keep the pages in the right order?
Check out the ScanSnap S510 by Fujitsu for a little over $400. (You can look at it on Amazon or the Fujitsu site, but you won't get the best price at either.) I have the S500. It works very well and comes with good software. It scans two sides at once, and you can load up to 50 pages of 20 lb paper. The better the paper quality (and the larger the page), the better the final results. It can scan at up to 1200 dpi in B&W, but I've found that 600 dpi is the best compromise between scan quality and speed.

When not in use it has a very small footprint. It is not TWAIN compliant. Output (as I use it) is searchable PDF. I use Nuance's PDF Converter Assistant to create an RTF file for editing.

The only problem I have had was with very poor paper quality in some cheap paperbacks. That resulted in multiple page feeds on a few occasions but mainly it had numerous OCR errors due to the ink bleeding during the printing process.

If you don't mind destroying the book (i.e. taking the pages apart), I highly recommend it.
Old 01-23-2008, 12:58 AM   #25
Gideon
Wearer of Pants
Posts: 1,050
Karma: 7634
Join Date: Jan 2008
Location: Norman, OK
Device: Amazon Kindle DX / iPhone
Aru makes a great point: the OpticBook may be a bit of a unitasker, but it's brilliant for scanning books.

In preparation for getting my Sony Reader, I went ahead and scanned one of my books. I used to do this when I had a tablet PC, so I had some experience. Moving the books from OCR'd PDFs to a text format was really where the hard bit came in.

If you can afford to spend the money on the OpticBook (a bit under $300 at Amazon, I believe), it is the single best investment you can make in this area. You can, as someone mentioned, scan very quickly and watch a movie at the same time.

The next part is the OCR. This is where it gets tricky, as most OCR programs will absolutely make a wreck of things. I would use grayscale here, btw; in my experience, it comes out better than black and white. Your mileage may vary.

Depending on your platform, you'll have a few options available to you. Most of the free ones I've tried are crap. The one that comes with Adobe Acrobat is average, and the best I've used is OmniPage Pro (but it is very expensive and hard for an individual to get hold of; maybe your school or business has it). Once you OCR it into text, the laborious part is going through and cleaning it all up.

The book I made took me about 5 hours all around, I'd say, but this was a test run, so there were lots of false starts. I imagine it'd take me about 2-3 hours now for an average-sized book, and I'd call it worth it.

I plan on writing a tutorial about this once I nail down some fine points. In the meantime, I suggest you look here - it's aimed at Tablet PC users, but there is an enormous amount of useful material here on the subject.

OpticBook Tutorial (other methods are mentioned as well on other pages here)
Old 01-23-2008, 05:28 AM   #26
aru
Likes to read on e-ink
Posts: 68
Karma: 50
Join Date: Feb 2007
Location: New Jersey
Device: Kindle DX
Hi Gideon, my OpticBook 3600 came with a complete software suite including OCR: effectively a turnkey system with ABBYY FineReader Sprint, Presto Page Mgr, etc. After I installed the software, everything else was automated (except the proofreading).

There is already a post that describes the scanner (which, btw, enticed me to buy it): https://www.mobileread.com/forums/sho...ight=opticbook
You may want to build on that.
Old 01-23-2008, 09:11 AM   #27
stxopher
Nameless Being
 
One thing to remember if you are looking at the Plustek scanners is not to confuse the OpticBook with their new Book Reader. It looks exactly the same, but there's a $300 price difference. If you didn't know there were two appliances from the same company with the same case, photos and basic purpose (scanning books), you might freak slightly and stop looking.

The new Book Reader focuses more on saving the pages as txt, PDFs, PDF text and audio files. (Yes, that last one was audio files. MP3s and WAVs, to be precise.) It seems to be designed more for keeping the printed word readable for those of us with failing sight, whereas the OpticBook's mission is the saving and shifting of printed information.

Between the two, the OpticBook series is still the best bet for most of us scanning books. It's fairly fast, easy and simple at what needs to be done. Still, I sure would like to see the Book Reader in action. Ummmm, making my own audio books for the commute. (No, no, NO! Shut up, little voice in my head with no financial sense and a high gadget lust! Shut up! Need more coffee to drown out the voice!)
Old 01-23-2008, 10:10 AM   #28
philodox
Member
Posts: 21
Karma: 110
Join Date: Dec 2007
Device: Bookeen CyBook Gen 3
Quote:
Originally Posted by yvanleterrible
The paper was so bad that the first page actually got shredded in the scanner, causing a paper jam and necessitating a dismantling of the device to get at the pieces.
Yikes, that is something to keep in mind then.
Quote:
Originally Posted by yvanleterrible
Do you have Acrobat Pro? I didn't know it did OCR!?!
I'm actually not sure of the exact version that I have, but I can check when I'm at home. It does have OCR though, that I'm sure of.
Quote:
Originally Posted by aru
Don't forget the Plustek OpticBook 3600.
Never heard of it, I'll do a search and see what I find. Thanks.
Quote:
Originally Posted by slayda
Check out the ScanSnap S510 by Fujitsu for a little over $400.
Cool, I'll check that out.

Thanks for the info and the OpticBook tutorial, Gideon.
Old 01-23-2008, 12:18 PM   #29
Gideon
Wearer of Pants
Posts: 1,050
Karma: 7634
Join Date: Jan 2008
Location: Norman, OK
Device: Amazon Kindle DX / iPhone
Aru-
I forgot about the OCR support it came with. I always used Acrobat Reader, so the only bundled software I used was the actual scanning software. I may need to give it a go though; perhaps it's better than OmniPage (and it doesn't involve me hauling my stuff to someone with that program!).
Old 01-29-2008, 02:05 AM   #30
snookums
Connoisseur
Posts: 81
Karma: 100
Join Date: Jan 2008
Device: Kindle
I hear a lot of people here saying that OCR isn't that good. I've found that OCR can be brilliant if you know what you are doing. I feel that OCR gets a bad rep because people don't realize the real magic is in the scanning.

Tip: Scan in RAW format. When you scan normally, the data from the scanner is processed with your settings and the excess data is discarded. RAW saves all of the data that the scanner gathered. Afterwards you can change settings and see what the result would have been if you had scanned with them. This is especially useful for the first few images, where you are trying to find the ideal color balance.

Tip: Scan in black and white and find the ideal color balance before starting. The color balance is very important. You don't want too much contrast in your scan because that will bring out speckles in the paper that will throw off the OCR software. This is counter-intuitive because you probably want to jack up the resolution and contrast to catch all of the detail in the book. Don't. Scan at 300 dpi and set the color or white balance so that you are only getting the text and not the texture of the page.

Tip: Make it straight. OCR software is built to handle horizontal lines of text. If there is more than a moderate slant in the way that you were holding the page over the scanner, it will spit out garbled text. Some of the more expensive OCR packages offer the ability to rotate text, but it's best just to hold the paper as straight as possible when you are scanning. That can be harder than you think when you are scanning a bound book.
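
To put a number on "slant", one simple check is to pick two points on the same line of text in the scanned image and compute the angle between them. The sketch below is only an illustration of that arithmetic: the function name is hypothetical, the points are assumed to be (x, y) pixel coordinates with y increasing downward, and how large an angle a given OCR package tolerates is not something this thread specifies.

```python
import math

def skew_degrees(left_point, right_point):
    """Angle of a text line, in degrees, from two points on its baseline.

    Points are (x, y) pixel coordinates with y increasing downward, so a
    positive result means the line drops as it runs to the right.
    """
    dx = right_point[0] - left_point[0]
    dy = right_point[1] - left_point[1]
    return math.degrees(math.atan2(dy, dx))

if __name__ == "__main__":
    # Made-up example: a baseline that drops 40 pixels across 1800 pixels.
    print(f"{skew_degrees((100, 500), (1900, 540)):.2f} degrees")
```

Rotating the image by the opposite angle before OCR would straighten the line, though holding the page straight in the first place avoids the extra step.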