Re: what software used for ocr on linux


 



The ABBYY engine does automatic rotation detection, or whatever the correct term is. I found that one needs to use the contrast setting of scanimage for best results.


On Fri, 11 Jul 2014, Tony Baechler wrote:

I vaguely recall Tesseract having an option for this, but it isn't
automatic.  The convert tool from ImageMagick should do that as well, but it isn't
automatic either.  The short answer is trial and error if memory serves.  I
remember thinking that maybe the reason for the terrible OCR is due to the
pages not being aligned and rotating the images, but I didn't get any better
results.  I haven't played with the other OCR engines.  I think FineReader
is better about this.  I'm possibly wrong here, but as I understand it, the
Windows software for the blind does the image rotation before passing it to
the OCR engine and detects the page misalignment during the scanning
process.  The Internet Archive seems to use FineReader and scans millions of
books in all kinds of conditions, so perhaps it can handle the rotation
automatically.
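
For anyone who wants to try the Tesseract plus ImageMagick route, here is a rough Python sketch rather than a tested recipe. It assumes a Tesseract build that accepts "--psm 0" (orientation and script detection; older 3.x releases spell it "-psm 0"), that the OSD data file is installed, and that the report contains "Rotate:" and "Orientation confidence:" lines. The function names and the min_confidence cutoff are made up for illustration, and "-deskew 40%" is just the commonly quoted ImageMagick setting for small misalignments.

```python
import re
import subprocess


def parse_osd(osd_text):
    """Pull the suggested rotation and its confidence out of an OSD report.

    Returns None when no "Rotate:" line is present.
    """
    rotate = re.search(r"^Rotate:\s*(\d+)", osd_text, re.MULTILINE)
    conf = re.search(r"^Orientation confidence:\s*([\d.]+)", osd_text, re.MULTILINE)
    if rotate is None:
        return None
    return {
        "rotate": int(rotate.group(1)),
        "confidence": float(conf.group(1)) if conf else 0.0,
    }


def rotate_command(src, dst, degrees, deskew=True):
    """Build an ImageMagick convert command (hypothetical helper)."""
    cmd = ["convert", src]
    if degrees:
        # Fixes the 90/180/270 degree case (book placed the wrong way round).
        cmd += ["-rotate", str(degrees)]
    if deskew:
        # Straightens small rotations; 40% is an assumed, commonly used threshold.
        cmd += ["-deskew", "40%"]
    cmd.append(dst)
    return cmd


def fix_orientation(src, dst, min_confidence=2.0):
    """Ask Tesseract for the page orientation, then rotate and deskew."""
    osd = subprocess.run(
        ["tesseract", src, "stdout", "--psm", "0"],
        capture_output=True, text=True, check=True,
    )
    info = parse_osd(osd.stdout)
    # Skip the coarse rotation if detection failed or looks unreliable.
    degrees = info["rotate"] if info and info["confidence"] >= min_confidence else 0
    subprocess.run(rotate_command(src, dst, degrees), check=True)
```

Calling fix_orientation("scan.png", "upright.png") would write a corrected copy that can then be passed to the OCR engine. Whether this actually improves recognition still comes down to trial and error.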

On 2014-07-10 06:49 AM, Sam Hartman wrote:
Is there a way to get tesseract or openocr or anything open-source to
deal with rotations?
The commercial software along with anything targeted for the blind tends
to

1) Deal with 90 or 180-degree rotations--I put the book down on the glass
in the wrong orientation

and

2) Deal with small rotations (it wasn't perfectly aligned) relatively
well.

I find these features really important when scanning things myself.
Less so when OCRing images from the web etc.

_______________________________________________
Blinux-list mailing list
Blinux-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/blinux-list


--
Have a good day,
Tony Baechler
tony@xxxxxxxxxxxx








