Re: extracting text from png files

Howdy,

I use tesseract for this.
With version 4.0, which was just released, I noticed that the results improved a lot here (for German and English use cases).
Some official numbers can be found here:
https://github.com/tesseract-ocr/docs/raw/master/das_tutorial2016/7Building%20a%20Multi-Lingual%20OCR%20Engine.pdf
The improvement is between 10 and 80 percent, depending on the language and its previous level of support.
It seems it got a new OCR engine based on neural networks.
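
In case it helps, here is a minimal sketch of how tesseract could be called from a Python script. It assumes the tesseract command-line tool and the German and English language data ("deu" and "eng") are installed. The function names and the file name scan.png are made up for illustration.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="deu+eng"):
    """Build the argument list for the tesseract command-line tool.

    Using "stdout" as the output base makes tesseract print the
    recognized text instead of writing it to a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def extract_text(image_path, lang="deu+eng"):
    """Return the recognized text, or None if tesseract is not installed."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract reports an error
    )
    return result.stdout
```

Calling extract_text("scan.png") would then return the text as a string. Languages are joined with "+", so "deu+eng" asks for German and English together.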

cheers chrys

On 17.12.18 at 16:57, Linux for blind general discussion wrote:
Disclaimer: I don't know which image formats either program supports
directly, nor do I know of a good way to convert between image
formats, though I'm pretty sure cuneiform supports at least .jpg and
.png files directly.

I also remember at least one OCR tutorial recommending some
preprocessing to make images easier for the OCR program to work with,
and I believe they used the convert command provided by imagemagick to
do so, but I forget the details.
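
A rough sketch of what such a preprocessing step might look like, wrapped in Python. The particular operations (grayscale, upscaling to 200%, a 50% threshold) are only an assumption about what commonly helps, not the steps from the tutorial mentioned above. The function names are made up, and newer ImageMagick releases may prefer the magick command over convert.

```python
import shutil
import subprocess


def build_preprocess_command(src, dst):
    """Build an ImageMagick convert call that prepares an image for OCR.

    The chosen steps are assumptions for illustration:
    grayscale, enlarge to 200%, then reduce to black and white.
    """
    return [
        "convert", src,
        "-colorspace", "Gray",  # drop color information
        "-resize", "200%",      # enlarge small text
        "-threshold", "50%",    # pure black and white
        dst,
    ]


def preprocess(src, dst):
    """Run the conversion; return False if convert is not installed."""
    if shutil.which("convert") is None:
        return False
    subprocess.run(build_preprocess_command(src, dst), check=True)
    return True
```

The threshold and scale values usually need adjusting per scan, so treat them as starting points only.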

Also, it's been a while since I've attempted any OCR'ing myself (how
often I had to manually clean up the output kind of put me off), so
there might be others on this list who can provide better, and more
specific advice on this subject.

Still, I hope I've at least got you started on the right track.




_______________________________________________
Blinux-list mailing list
Blinux-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/blinux-list


