Kevin wrote:
> Hi,
>
> We are trying to create a script which (the same as Google search and
> Gmail) allows PDFs, Docs, Excel files etc. to be converted to HTML
> documents dynamically, just in case users want to view documents
> but don't have the necessary software. The HTML needs to keep as much of
> the styling as possible.
>
> Does anyone know how Google have done this? Or does anyone know of any PHP
> equivalents? We are using the Windows IIS server.

Have a look at the *nix command line utility pdf2html. It may give you some guidance, but as you are using Windows I don't know whether it will run there. I'm fairly sure Google uses *nix, and perhaps even the pdf2html tool itself?

As for Word documents, I know there are *nix command line tools to extract text from them for indexing, but I'm not sure about format conversion. It should certainly be easy enough to produce a PDF from Word/Excel and then use whatever technique you choose for the PDF-to-HTML step.

If your web server has Word/Excel installed (not recommended, for security reasons etc.) then I presume there is some sort of method you can call to interact with it. I'm guessing, though, that Word will not take command line args like any sensible program and do the automated conversion for you... ;)

Not much help here I know, but hopefully some food for thought.

Col

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
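
P.S. To make the idea of shelling out to a command line converter a bit more concrete, here is a rough PHP sketch. It is only an illustration under assumptions: the converter name you pass in (for example `pdf2html`) and the `converter input output` argument order are guesses on my part, and the function names are made up for this example. Check the documentation of whatever tool you actually install.

```php
<?php
// Sketch only: the converter name and its "input output" argument order
// are assumptions, not something confirmed for any particular tool.

/**
 * Build the shell command for a (hypothetical) PDF-to-HTML converter.
 * $converter should come from trusted configuration, never from user input.
 */
function buildConvertCommand(string $converter, string $pdfPath, string $htmlPath): string
{
    // escapeshellarg() quotes each path so that spaces or shell
    // metacharacters in a filename cannot change the command.
    return $converter . ' ' . escapeshellarg($pdfPath) . ' ' . escapeshellarg($htmlPath);
}

/**
 * Run the converter and return the generated HTML, or null on failure.
 */
function convertPdfToHtml(string $converter, string $pdfPath, string $htmlPath): ?string
{
    if (!is_file($pdfPath)) {
        return null; // nothing to convert
    }

    exec(buildConvertCommand($converter, $pdfPath, $htmlPath), $output, $exitCode);

    if ($exitCode !== 0 || !is_file($htmlPath)) {
        return null; // converter missing, or the conversion failed
    }

    $html = file_get_contents($htmlPath);
    return $html === false ? null : $html;
}
```

You would call it along the lines of `convertPdfToHtml('pdf2html', $uploadedPdf, $cachePath)` and fall back to offering the original file for download whenever it returns null. Writing the HTML to a cache path means each document only needs converting once.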