On Mon, Jun 30, 2008 at 4:43 PM, Craig White <craigwhite@xxxxxxxxxxx> wrote:
>> >>>>> Is there an F8 application that will convert a .png copy of a text list
>> >>>>> to a text file?
>> >>>>
>> >>>> ----
>> >>>> png is a picture file and there is no text.
>> >>>>
>> >>>> If you want OCR (optical character recognition - software that scans a
>> >>>> picture for recognizable text and saves the recognized text to a file),
>> >>>> I would suggest tesseract.
>> >>>
>> >>> Thanks, I will look at that.
>> >>>
>> >>
>> >> I believe that Tesseract only understands TIF files, so you will need
>> >> to convert the png before you can OCR them.
>> >>
>> >
>> > Yes, I discovered that requirement but now I am stumped by -
>> >
>> > The command line is:
>> > tesseract <image.tif> <output> [-l langid]
>> >
>> > I thought "-l enUS" might work but no go there.
>> >
>> > There's no man page, only a README and that doesn't tell me about the langid
>> > other than it wants it. Without it I get very strange looking text.
>>
>> Unfortunately, the OCR programs working in Linux are not very good
>> yet. In case you have access to Acrobat Professional, use it instead;
>> the results are usually excellent.
> ----
> I've never used Acrobat Professional for OCR but I have gotten excellent
> results from tesseract on Linux.
>
> OP should check out...
>
> http://www.groklaw.net/article.php?story=20061210115516438&query=tesseract
>
> http://www.linuxjournal.com/article/9676

I have used both (Acrobat Professional and Tesseract) on the same
documents. Regarding results, Acrobat Professional beats Tesseract by
far.

Paul

--
fedora-list mailing list
fedora-list@xxxxxxxxxx
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
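
On the langid question quoted above: Tesseract's language IDs are
three-letter codes that match its trained-data files, so English is
"eng" rather than "enUS". The two steps discussed in the thread (convert
the PNG to TIFF, then run the OCR) can be sketched roughly as below. This
is only an illustration that builds and prints the two command lines; the
use of ImageMagick's `convert` for the image conversion, the file name
`list.png`, and the helper name `ocr_commands` are assumptions, not
something stated in the thread.

```python
import shlex
from pathlib import Path


def ocr_commands(png_path, lang="eng"):
    """Build the two command lines: PNG -> TIFF, then TIFF -> text.

    Assumptions (not from the thread): ImageMagick's `convert` does the
    image conversion, and the language ID is a three-letter code ("eng").
    """
    png = Path(png_path)
    tif = png.with_suffix(".tif")
    # Tesseract takes an output *base* name and appends ".txt" itself.
    out_base = png.with_suffix("")
    convert_cmd = ["convert", str(png), str(tif)]
    tesseract_cmd = ["tesseract", str(tif), str(out_base), "-l", lang]
    return convert_cmd, tesseract_cmd


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing needs to be
    # installed to try the sketch.
    for cmd in ocr_commands("list.png"):
        print(shlex.join(cmd))
```

To actually run the steps, each command list could be passed to
`subprocess.run(cmd, check=True)`, assuming both tools are installed and
on the PATH.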