Anyone able to OCR a PDF file?

mwhapples@xxxxxxx (Michael Whapples) · Tue, 3 Jan 2012 17:38:12 -0000

I have mostly used Cuneiform for Linux myself. I cannot remember whether it
can handle PDF files natively (possibly; it certainly accepts more formats
than TIFF). If it cannot, you could run a conversion tool first (memory seems
to say pdf2tiff).
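
As a rough illustration of the convert-then-OCR idea, here is a minimal
Python sketch. It assumes pdftoppm (from poppler-utils, built with TIFF
support) stands in for the conversion tool and that tesseract is installed;
neither choice is confirmed in this thread, and the function names are made
up for the example.

```python
import glob
import subprocess


def build_convert_cmd(pdf_path, prefix, dpi=300):
    """Command that renders each PDF page to <prefix>-<page>.tif.

    Assumes pdftoppm from poppler-utils; 300 dpi is a common OCR choice,
    not a requirement.
    """
    return ["pdftoppm", "-r", str(dpi), "-tiff", pdf_path, prefix]


def build_ocr_cmd(image_path, out_base):
    """Command that makes tesseract write its text to <out_base>.txt."""
    return ["tesseract", image_path, out_base]


def ocr_pdf(pdf_path, prefix="page"):
    """Convert a PDF to TIFF pages, OCR each page, return the text files."""
    subprocess.run(build_convert_cmd(pdf_path, prefix), check=True)
    text_files = []
    # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
    for image in sorted(glob.glob(glob.escape(prefix) + "-*.tif")):
        out_base = image[: -len(".tif")]
        subprocess.run(build_ocr_cmd(image, out_base), check=True)
        text_files.append(out_base + ".txt")
    return text_files
```

Calling ocr_pdf("scan.pdf") should leave one .txt file per page next to the
TIFF images; "scan.pdf" is only a placeholder name.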

Michael Whapples

-----Original Message----- 
From: Janina Sajka
Sent: Tuesday, January 03, 2012 4:40 PM
To: speakup at braille.uwo.ca
Subject: Anyone able to OCR a PDF file?

Has anyone figured out how to get one of the Linux OCR engines (like
tesseract) to accept a graphical file (other than .tiff) as input? In
particular I'm going to be swamped with graphical PDF files this year.
Printing these just to scan them seems both wasteful and inefficient.

I know people do this on other OSes. Does anyone have suggestions on how to
do this in Linux?

All suggestions greatly appreciated.

Janina

-- 

Janina Sajka, Phone: +1.443.300.2200
sip:janina at asterisk.rednote.net

Chair, Open Accessibility janina at a11y.org
Linux Foundation http://a11y.org

Chair, Protocols & Formats
Web Accessibility Initiative http://www.w3.org/wai/pf
World Wide Web Consortium (W3C)