12-02-2021, 08:06 AM | #1 |
the rook, bossing Never.
Posts: 12,247
Karma: 89531599
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
OCR after scan update
|
12-02-2021, 12:27 PM | #2 |
Wizard
Posts: 1,582
Karma: 7043711
Join Date: Mar 2013
Location: Rosario - Santa Fe - Argentina
Device: Kindle 4 NT
|
Many thanks for the info! By chance, do you know of a good Windows GUI for this new release of Tesseract?
|
12-02-2021, 03:35 PM | #3 |
the rook, bossing Never.
Posts: 12,247
Karma: 89531599
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Sorry, I don't remember. I ditched Windows 100% for Linux Mint + Mate Desktop in Jan 2017 after nearly 25 years, though I'd been using Linux on servers and dual boot since 1998 (Red Hat, Suse, CentOS, Debian, Ubuntu, DSL etc).
|
12-02-2021, 06:31 PM | #4 | |
Grand Sorcerer
Posts: 5,421
Karma: 99236514
Join Date: Apr 2011
Device: pb360
|
Quote:
This was the first hit for a web search on "tesseract ocr windows gui". I have no idea whether any of them are any good. The above list is not limited to Microsoft platforms. |
|
12-03-2021, 06:34 AM | #5 | |
Wizard
Posts: 1,582
Karma: 7043711
Join Date: Mar 2013
Location: Rosario - Santa Fe - Argentina
Device: Kindle 4 NT
|
Quote:
|
|
12-03-2021, 12:20 PM | #6 | |
Grand Sorcerer
Posts: 5,421
Karma: 99236514
Join Date: Apr 2011
Device: pb360
|
Quote:
https://github.com/manisandro/gImageReader/issues/285
There is a pending pull request that supposedly fixes the above, but it looks like it won't be merged.
https://github.com/manisandro/gImageReader/pull/286
The link below goes to the gImageReader author's instructions for building Tesseract, but the links in it are dead.
https://github.com/manisandro/gImageReader/issues/357
This is all very strange, since people having the problem say Tesseract from the command line is not slow, and the gImageReader author says it's not a gImageReader problem. This is all about T V3 -> T V4. |
|
12-04-2021, 05:55 AM | #7 | |
Wizard
Posts: 1,582
Karma: 7043711
Join Date: Mar 2013
Location: Rosario - Santa Fe - Argentina
Device: Kindle 4 NT
|
Quote:
1. I downloaded and installed this GUI: https://github.com/Parathantl/tesseract_gui/releases (it installs Tesseract 4, but it is easy to replace V4 with V5).
2. That GUI is for OCRing PDF files.
3. I OCRed a 25-page PDF and noted the time it took to finish.
4. I repeated the job in console mode. The results were practically the same.
5. After my tests, I can say that ABBYY is at least twice as fast as Tesseract, while the accuracy is almost the same.
Finally, I think I've discovered the cause of the speed difference: Tesseract is using ONLY ONE CPU. I don't know how the .exe (64-bit) was compiled, but either it is not multithreaded or the user has no option to enable it (maybe things are different under Linux). A real pity, because it is a nice, free program with very good OCR accuracy. |
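On the single-CPU observation above: one possible workaround, offered only as a rough sketch and not something tested in this thread, is to start one Tesseract process per page and run several of them at once. It assumes the PDF pages have already been exported as one image each and that `tesseract` is on the PATH. The helper names `build_command` and `ocr_pages` are made up for illustration. `OMP_THREAD_LIMIT` is a standard OpenMP environment variable; it only has an effect if the Tesseract build uses OpenMP, which I can't confirm for the Windows .exe discussed here.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(image, out_base, lang="eng"):
    """Return the argv for one run: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image), str(out_base), "-l", lang]


def ocr_pages(images, out_dir, workers=None, runner=subprocess.run):
    """OCR each page image in its own Tesseract process, several at a time.

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be tried out without Tesseract installed.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: the build uses OpenMP. Limiting each process to one
    # thread keeps the parallel processes from competing for cores.
    env = dict(os.environ, OMP_THREAD_LIMIT="1")

    def run_one(image):
        out_base = out_dir / Path(image).stem
        runner(build_command(image, out_base), env=env, check=True)
        # Tesseract appends ".txt" to the output base name.
        return Path(str(out_base) + ".txt")

    # Threads are enough here: each one just waits on an external process.
    with ThreadPoolExecutor(max_workers=workers or os.cpu_count()) as pool:
        return list(pool.map(run_one, images))
```

Something like `ocr_pages(sorted(Path("pages").glob("*.png")), "ocr_out")` would then process a folder of page images (both folder names are placeholders) and return the text file paths in page order. Whether this closes the gap to the commercial program would need the same kind of timing test as in steps 3 and 4.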
|
|