I am sure TIFF is supported, which is what makes this so strange. I get what look like words, and I get the same output every time I scan the same image, but the words are nonsense. I even tried adding the language designation for English, thinking that somehow it wasn't using English, but got the same results. I know the image file is okay because it comes out fine using ABBYY FineReader Express on my Mac.

--
Cheryl

May the words of my mouth
and the meditation of my heart
be acceptable to You, Lord,
my rock and my Redeemer.
(Psalm 19:14 HCSB)

> On Nov 2, 2015, at 10:15 PM, Tom Fowle <wa6ivgtf@xxxxxxxxxxx> wrote:
>
> Cheryl,
> I arbitrarily chose to convert the PDF to JPEG, as tesseract doesn't do
> PDF.
>
> Then I just did
> tesseract filename.jpg outfile
> which produces
> outfile.txt
>
> Sorry, I haven't tried .tif, and I couldn't find a list of supported file types.
>
> Tom Fowle
>
> On Mon, Nov 02, 2015 at 02:53:45PM -0600, Cheryl Homiak wrote:
>> Would you mind enlarging on this if you can and have time? What kind of file did you use, and what did you put in your command line? I am asking because I have tried to use tesseract a couple of times with TIFF files and have gotten mostly gibberish, so obviously I am doing something wrong. I am running Debian testing, if that makes a difference.
>>
>> Thanks.
>>
>> --
>> Cheryl
>>
>>> On Nov 2, 2015, at 2:13 PM, John G Heim <jheim@xxxxxxxxxxxxx> wrote:
>>>
>>> I've been scanning in the D&D 5th Edition Player's Handbook. I tried every open source OCR program I could find, and tesseract was easily the best. On pages that are just prose, it probably gets about 99% accuracy. Even on pages that are 2 columns of prose, it does really well if you tell it to look for that. Somebody sent me a PDF of the same book done with a professional OCR program for Windows. The results are approximately equal.
>>> Tesseract may lack the bells & whistles of commercial products, but for accuracy it's pretty good.
>>>
>>> On 11/01/2015 11:24 PM, Tom Fowle wrote:
>>>> Am I the last to find this?
>>>> The command-line OCR tool tesseract
>>>> won't directly support .pdf, but
>>>> pdftocairo
>>>> produces .jpg, among others, which tesseract will read.
>>>>
>>>> It may not do well with columns, but it's not too bad.
>>>>
>>>> Is there anything better?
>>>>
>>>> Thanks
>>>> Tom Fowle
>>>> _______________________________________________
>>>> Speakup mailing list
>>>> Speakup@xxxxxxxxxxxxxxxxx
>>>> http://linux-speakup.org/cgi-bin/mailman/listinfo/speakup
>>>
>>> --
>>> John Heim, jheim@xxxxxxxxxxxxx, 608-263-4189, skype:john.g.heim, sip:jheim@xxxxxxxxxxxxxxxx
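
For anyone following the thread, the two steps Tom describes (convert the PDF to images with pdftocairo, then run tesseract on each image) might be sketched as a small shell function. This is only a rough sketch and not anyone's exact command line from the thread: the function names ocr_pdf and run, the DRY_RUN switch, the choice of PNG, the 300 dpi resolution, and the file names are all assumptions made for illustration. The -l eng option is the English language designation mentioned above. Low-resolution input can produce nonsense output, so the sketch sets the resolution explicitly, though that may or may not be the cause of the gibberish reported here.

```shell
#!/bin/sh
# Sketch: rasterize a PDF with pdftocairo, then OCR each page with tesseract.
# Assumptions (not from the thread): 300 dpi, PNG output, English ("eng").

# Print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# ocr_pdf INPUT.pdf PREFIX
# Writes PREFIX-<n>.png page images and PREFIX-<n>.txt text files.
ocr_pdf() {
    pdf=$1
    prefix=$2
    # -r sets the resolution in dpi.
    run pdftocairo -png -r 300 "$pdf" "$prefix" || return 1
    for img in "$prefix"-*.png; do
        [ -e "$img" ] || continue   # skip if the glob matched nothing
        # tesseract appends .txt to the output base name itself.
        run tesseract "$img" "${img%.png}" -l eng
    done
}
```

With hypothetical file names, "ocr_pdf handbook.pdf page" would leave page-1.txt, page-2.txt, and so on next to the page images. Setting DRY_RUN=1 first prints the commands instead of running them, which is a cheap way to check what would be executed.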