Hi,
Have you experimented with the pdftotext -layout and -raw options? I've
noticed that running it with no command line options usually produces
usable output, but sometimes the -raw option works better. Then again,
sometimes it makes things worse. The output file is generally smaller
when using the -raw option.

Also, since presumably most people here are interested in making PDF
documents accessible, or at least in finding a way to use the text from
them, do you know of a way to get around the protection bits? I'm not
trying to pirate anything, but I know of at least one major audio
editing package which ships manuals that can't be read with pdftotext
because of the no-print and no-copy bits. I know the text is there in
the PDF file because the vendor sent me an unprotected copy upon
request, but they have since been bought out by a major media company.
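
In case it helps with comparing the modes, here is a minimal Python sketch that runs pdftotext three ways (no options, -layout, -raw) on the same file and reports how many bytes of text each produces. The names MODES, build_command and compare_modes are my own, made up for illustration. It assumes pdftotext is on your PATH and relies on "-" as the output file name meaning standard output.

```python
import shutil
import subprocess

# The three modes discussed above: no options, -layout, and -raw.
MODES = {"default": [], "layout": ["-layout"], "raw": ["-raw"]}


def build_command(pdf_path, flags):
    """Return the pdftotext argument list for one mode.

    Passing "-" as the text file asks pdftotext to write to stdout.
    """
    return ["pdftotext", *flags, pdf_path, "-"]


def compare_modes(pdf_path):
    """Run pdftotext once per mode and return {mode name: output size in bytes}."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    sizes = {}
    for name, flags in MODES.items():
        result = subprocess.run(
            build_command(pdf_path, flags),
            capture_output=True,
            check=True,  # raises if pdftotext exits with an error
        )
        sizes[name] = len(result.stdout)
    return sizes
```

Calling compare_modes("manual.pdf") (a hypothetical file name) returns a dictionary of sizes, which gives a quick check on whether -raw really shrinks the output for a given document. Size alone says nothing about reading order, so the text itself still needs a listen.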
Geoff Shang wrote:
I have both pstotext and pdftotext installed here. Results seem to
vary as to which is better and you may want to try both if a document
is proving difficult to read and see which gives the best results.
_______________________________________________
Blinux-list mailing list
Blinux-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/blinux-list