Re: Re: Reading a Word document from PHP

Ashley Sheridan <ash@xxxxxxxxxxxxxxxxxxxx> · Tue, 23 Sep 2008 10:29:49 +0100



On Tue, 2008-09-23 at 10:55 +0200, Grzegorz Kurtyka wrote:
> Ashley Sheridan wrote:
> 
> > Hi All,
> > 
> > I recently asked a question regarding reading a PDF with PHP. I've tried
> > Zend_pdf, but all this is able to give me is the number of pages in a
> > PDF, and cannot extract the text from the PDF files I have. I thought
> > I'd try a different method, and try to extract the text straight from
> > the Word document which is used to generate the PDF. Does anyone have
> > any experience with this sort of thing, or enough to suggest a library
> > which is capable of this?
> > 
> > 
> > Ash
> > www.ashleysheridan.co.uk
> > 
> It might be worth looking into apps like catdoc
> (http://freshmeat.net/projects/catdoc/). It's able to extract text from
> doc/ppt/xls files (similar to the "strings" command, but it takes care of
> the files' internal encoding for you). Reading the README in this package
> might give you some tips.
> 
> 
I'd actually considered doing something along these lines myself, as the
Word docs were only simple affairs, and I only wanted the text from them
anyway.
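
For anyone following along, here is a rough sketch of how the catdoc
suggestion might be wired up from PHP. It assumes the catdoc binary is
installed and on the PATH, and that shell_exec() is not disabled on the
server. The function names (buildCatdocCommand, extractDocText) are made up
for illustration, and the -w flag (which I understand turns off catdoc's
line wrapping) should be checked against the README for your version.

```php
<?php
// Hypothetical helper: builds the shell command used to call catdoc.
// escapeshellarg() keeps spaces and quotes in the path from breaking
// (or being abused in) the command line.
function buildCatdocCommand(string $path): string
{
    // -w: assumed to disable word wrapping; verify with your catdoc README.
    return 'catdoc -w ' . escapeshellarg($path);
}

// Hypothetical helper: returns the plain text of a .doc file,
// or null if the file is unreadable or catdoc produced no output.
function extractDocText(string $path): ?string
{
    if (!is_readable($path)) {
        return null;
    }

    $output = shell_exec(buildCatdocCommand($path));

    // shell_exec() does not return a string when the command fails
    // or prints nothing, so treat that case as "no text".
    return is_string($output) ? trim($output) : null;
}
```

Calling extractDocText('/path/to/report.doc') should then give back the
document text as a string, or null if something went wrong. This only
suits simple documents where the plain text is all that's needed, since
any formatting is thrown away.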


Ash
www.ashleysheridan.co.uk


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php