Re: PDF to Text

Jay Blanchard wrote:
[snip]
I am trying to find a way for a program to search through the text on a
PDF. My first thought was to use pdftotext, but the PDFs generated by our
commercial scanner/copier/printer machine do not seem to work with
pdftotext... it just outputs two CRLFs.  I've been looking around on the
net for something similar that might work.

Anyone know of something like that?

Thanks,
--
Ray Hauge

Things I forgot to post:

It is a PHP script.  I was planning on using shell_exec() to call the
program and read the output from stdout.
[/snip]

Sounds like the PDFs are images and therefore will not be readable by
anything, save for eyeballs. I have run into this quite a bit. The
scanner scans the doc via a TWAIN driver, which then converts the info
into an image of whatever was scanned. It would be like trying to read
text programmatically from a JPEG: not really possible, short of OCR.
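
For the shell_exec() plan in the quoted message, here is a minimal sketch
of how a script might detect the image-only case instead of silently
searching empty output. The function names pdfToText() and
hasExtractableText() are made up for illustration, and the sketch assumes
pdftotext is on the PATH and accepts "-" as the output file to write to
stdout. The check uses \S rather than trim() on the assumption that
pdftotext may emit form feeds between pages, which trim() does not strip
by default.

```php
<?php
// Hypothetical helper: true only if the extracted output contains at
// least one non-whitespace character. CR, LF and form feed all count
// as whitespace for \S, so "two CRLFs" is treated as "no text".
function hasExtractableText(?string $output): bool
{
    return $output !== null && preg_match('/\S/', $output) === 1;
}

// Hypothetical wrapper: run pdftotext and return whatever it wrote to
// stdout, or null if shell_exec() gave back nothing usable.
function pdfToText(string $pdfPath): ?string
{
    // escapeshellarg() guards against spaces and shell metacharacters
    // in the file name. The trailing "-" asks for output on stdout.
    $cmd = 'pdftotext ' . escapeshellarg($pdfPath) . ' -';
    $out = shell_exec($cmd);

    return is_string($out) ? $out : null;
}
```

Something like hasExtractableText(pdfToText($file)) returning false would
then flag a PDF as a probable scanned image that needs a different tool,
rather than one that simply does not contain the search term.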


Ghostscript (http://www.cs.wisc.edu/~ghost/) will do it.
