Re: 'View as HTML' Conversions

Kevin wrote:
> Hi,
> 
> We are trying to create a script which (the same as google search and
> gmail) allows for PDF's, Doc's, Excel etc to be converted to HTML
> documents dynamically, this is just in case users want to view documents
> but don't have the necessary software. The HTML needs to keep as much of
> the styling as possible.
> 
> Does anyone know how google have done this? or does anyone know any PHP
> equivalents, we are using the Windows IIS server.

Have a look at the *nix command-line utility pdf2html. It may give you
some guidance, though since you are on Windows I don't know how well it
will carry over. I'm pretty sure Google uses *nix, and perhaps even the
pdf2html tool itself.

As for Word documents, I know there are *nix command-line tools that
extract text from them for indexing, but I'm not sure about format
conversions. It should certainly be easy enough to produce a PDF from
Word/Excel and then use whatever technique you choose for the
PDF-to-HTML step.
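
To make the two-step idea concrete, here is a rough sketch of the
Office -> PDF -> HTML chain. It is only an illustration under assumptions
that are not in Kevin's setup: that LibreOffice's `soffice` and poppler's
`pdftohtml` are installed and on the PATH (the tool I called pdf2html above
may be named differently on your system), and that a Python script is
acceptable as glue. The function names are made up for the example, and the
exact output file naming of `pdftohtml` can vary between versions.

```python
from __future__ import annotations

import subprocess
from pathlib import Path

# File types routed through the intermediate PDF step (illustrative list).
OFFICE_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx"}


def office_to_pdf_command(source: Path, out_dir: Path) -> list[str]:
    # Assumption: LibreOffice's `soffice` binary is available on PATH.
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(out_dir), str(source)]


def pdf_to_html_command(pdf: Path, html: Path) -> list[str]:
    # Assumption: poppler's `pdftohtml` is available on PATH.
    # -c asks for "complex" output that tries to keep the layout,
    # -noframes asks for a single HTML document instead of a frameset.
    return ["pdftohtml", "-c", "-noframes", str(pdf), str(html)]


def plan_conversion(source: Path, out_dir: Path) -> list[list[str]]:
    """Return the commands needed to turn `source` into HTML, in order."""
    suffix = source.suffix.lower()
    steps: list[list[str]] = []
    pdf = source
    if suffix in OFFICE_EXTENSIONS:
        steps.append(office_to_pdf_command(source, out_dir))
        pdf = out_dir / (source.stem + ".pdf")
    elif suffix != ".pdf":
        raise ValueError(f"unsupported file type: {suffix}")
    steps.append(pdf_to_html_command(pdf, out_dir / (source.stem + ".html")))
    return steps


def convert(source: Path, out_dir: Path) -> Path:
    """Run each planned command; raises CalledProcessError if a tool fails."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for command in plan_conversion(source, out_dir):
        subprocess.run(command, check=True)
    return out_dir / (source.stem + ".html")
```

Something like `convert(Path("report.doc"), Path("out"))` would then run
both steps in turn. Keeping the command planning separate from the
execution means you can check what would be run without having the tools
installed, and a PHP version could build the same command lines.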

If your web server has Word/Excel installed (not recommended, for security
reasons etc.) then I presume there is some sort of method you can call to
interact with it. I'm guessing, though, that Word will not take command-line
args and do an automated conversion for you like any sensible program would... ;)

Not much help here I know, but hopefully some food for thought.

Col

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

