On Tue, 2008-09-23 at 08:51 +0200, clive wrote:
> Hi I know the new microsoft docx format is an xml document, so you could
> probably use the xml parser with that.
>
> Any chance you can get them to use a rtf file instead of a word file to
> convert to pdf, rtf is mostly readable text with some control words
> thrown in for formatting.
>
> clive
>
> Ashley Sheridan wrote:
> > Hi All,
> >
> > I recently asked a question regarding reading a PDF with PHP. I've tried
> > Zend_pdf, but all this is able to give me is the number of pages in a
> > PDF, and cannot extract the text from the PDF files I have. I thought
> > I'd try a different method, and try to extract the text straight from
> > the Word document which is used to generate the PDF. Does anyone have
> > any experience with this sort of thing, or enough to suggest a library
> > which is capable of this?
> >
> >
> > Ash
> > www.ashleysheridan.co.uk
> >

No worries about the tip, it was a good tip. Unfortunately I'm stuck
trying to extract text from a Word document or a PDF file because she
doesn't know how to make a CSV in Excel, despite me showing her how to
do it. She kept trying to upload a PDF to the site and wondered why it
wasn't able to pick out the text from that! Obviously an attempt was
made to shift the blame to me for not building the site right!


Ash
www.ashleysheridan.co.uk

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php