How to fetch .DOC or .DOCX file in php

> -----Original Message-----
> From: Andrew Ballard [mailto:aballard@xxxxxxxxx]
> Sent: Friday, December 05, 2008 9:11 AM
> To: Jim Lucas
> Cc: Shawn McKenzie; php-general@xxxxxxxxxxxxx
> Subject: Re:  How to fetch .DOC or .DOCX file in php
> 
> On Thu, Dec 4, 2008 at 10:35 PM, Jim Lucas <lists@xxxxxxxxx> wrote:
> > I was going to say that I haven't yet decided on what the final
> > output format is going to be.  Probably either RTF or OpenXML.
> >
> > How about I ask for suggestions on what would be the best format to
> > store the final copy?
> >
> > I figured that this tool would mainly be used for .doc to web
> > conversion, but I guess it could also be used to convert to other
> > document formats.
> >
> > But I would like to have the ability to at least store the formatting
> > inline with the text.  So, some form of XML: be it (X)HTML, plain XML,
> > or even OpenXML.
> >
> > A question to all, then.  How would you like to see the text, with
> > formatting, stored?
> 
> It's an excellent start. It pulled in some additional control
> characters in some of the documents I tried, and some documents had
> extra stuff at the end of the document. It was still text, but it
> looked like the text from the page header/footer definitions. It would
> be cool to see this polished and released. I just wish there were
> something this basic that worked this well on PDF files! :-)

Andrew,

There's something to be said for language interoperability. I've become enamored with the iText package for creating, manipulating, and extracting data from PDF documents and their associated info, bookmarks, tags, etc. There was, for a time, an open-source PDF editor built with JPedal/iText that looked like it would soon compete with Acrobat for fillable PDF forms, but the author has had little time to work on it.

Anyway, you can set up a Java program (yes, iText is Java) to extract the text from the form fields, or from the entire document, and output it in whatever format you like (text, XML, whatever).

iText - http://www.lowagie.com/iText/ 
PHP/Java bridge - http://php-java-bridge.sourceforge.net/pjb/
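
To make the "output it in whatever format you like" part concrete, here is a minimal sketch of the formatting half only. It assumes the field names and values have already been pulled out of the PDF (with iText or anything else) into a Map; the extraction call itself is deliberately not shown, and the class name FieldExport and the <fields>/<field> element names are made up for illustration, not part of iText or the bridge.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: formats already-extracted PDF field values.
// The PDF extraction step (e.g. via iText) is assumed to happen elsewhere.
public class FieldExport {

    // Escape the five characters that are special in XML text and attributes.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")   // must run first
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // One "name: value" line per field.
    static String toText(Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            out.append(e.getKey()).append(": ").append(e.getValue()).append("\n");
        }
        return out.toString();
    }

    // Field names go in an attribute because they may not be valid XML element names.
    static String toXml(Map<String, String> fields) {
        StringBuilder out = new StringBuilder("<fields>\n");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            out.append("  <field name=\"").append(escapeXml(e.getKey())).append("\">")
               .append(escapeXml(e.getValue()))
               .append("</field>\n");
        }
        return out.append("</fields>\n").toString();
    }

    public static void main(String[] args) {
        // Stand-in data; in practice this map would be filled from the PDF.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Andrew");
        fields.put("note", "R&D <draft>");
        System.out.print(toText(fields));
        System.out.print(toXml(fields));
    }
}
```

From PHP you could then either run something like this as a command-line program and read its output, or call the class through the PHP/Java bridge; which is better depends on your setup.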

HTH,


// Todd

