Re: AW: Read Text Content from PDF file

haliphax wrote:
On Mon, Mar 30, 2009 at 3:16 AM, Michael A. Peters <mpeters@xxxxxxx> wrote:
Andrei Bintintan wrote:
Any other ideas?


Also, I believe the commercial (full) version of PDFlib can do this. In fact,
I believe you can do it with the pecl-pdflib extension if you have the full
version of PDFlib installed.

I seem to remember there was work on a clibpdf PHP wrapper back when PHP 4 was
new; I don't know if it ever amounted to anything.

It may be worth your time to look into Adobe's IFilters. I know
they've got class libraries for C# and other .NET languages--those COM
DLLs could either be leveraged for PHP or there may be a "native" PHP
implementation.

Not in the mood to Google,


Just a note -

It looks overpriced to me but may be of value to corporate users: PDFlib has a program called TET that can eat a PDF file and spit out an XML file with all kinds of groovy info about the PDF in it.

Wish I had a spare grand just to play with it (since I'm Linux only, even at home, the server license is the only possibility for me).

It looks like TET would be a superior way to grab info from PDF files for database/etc. storage.

I wonder if xpdf/poppler-utils could be extended to do something similar under a FOSS license. But that's off topic for this list.
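For plain text only (none of the XML metadata TET produces), the pdftotext tool that ships with xpdf and poppler-utils can already be scripted. Below is a minimal sketch in Python rather than PHP, just to show the shape of the call; the same command line could be run from PHP. The function names and the file name report.pdf are hypothetical, and it assumes pdftotext is on the PATH. The options used (-enc, -layout, -f, -l, and "-" as the output file for stdout) are standard pdftotext options.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, first_page=None, last_page=None, keep_layout=True):
    """Build the pdftotext argument list (hypothetical helper, not part of poppler)."""
    command = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        # -layout tries to preserve the physical layout of the page
        command.append("-layout")
    if first_page is not None:
        command += ["-f", str(first_page)]
    if last_page is not None:
        command += ["-l", str(last_page)]
    # "-" as the output file sends the extracted text to stdout
    command += [pdf_path, "-"]
    return command


def extract_text(pdf_path, **options):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils or xpdf")
    result = subprocess.run(
        build_pdftotext_command(pdf_path, **options),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # report.pdf is a placeholder file name
    print(build_pdftotext_command("report.pdf", first_page=1, last_page=2))
```

The returned string could then go straight into a database column; anything structural (fonts, positions, metadata) would still need a tool like TET.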

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

