RE: Reading a Word document from PHP

> -----Original Message-----
> From: Bastien Koert [mailto:phpster@xxxxxxxxx]
> Sent: Monday, September 22, 2008 4:00 PM
> To: ash@xxxxxxxxxxxxxxxxxxxx
> Cc: php-general
> Subject: Re:  Reading a Word document from PHP
> 
> On Mon, Sep 22, 2008 at 4:58 PM, Ashley Sheridan
> <ash@xxxxxxxxxxxxxxxxxxxx> wrote:
> 
> > On Mon, 2008-09-22 at 16:47 -0400, Bastien Koert wrote:
> > >
> > >
> > > On Mon, Sep 22, 2008 at 3:56 PM, Ashley Sheridan
> > > <ash@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >         Hi All,
> > >
> > >         I recently asked a question regarding reading a PDF with PHP.
> > >         I've tried Zend_pdf, but all this is able to give me is the
> > >         number of pages in a PDF, and cannot extract the text from
> > >         the PDF files I have. I thought I'd try a different method,
> > >         and try to extract the text straight from the Word document
> > >         which is used to generate the PDF. Does anyone have any
> > >         experience with this sort of thing, or enough to suggest a
> > >         library which is capable of this?
> > >
> > >
> > >         Ash
> > >         www.ashleysheridan.co.uk
> > >
> > >
> > >         --
> > >         PHP General Mailing List (http://www.php.net/)
> > >         To unsubscribe, visit: http://www.php.net/unsub.php
> > >
> > >
> > >
> > > This may help
> > >
> > > http://drewd.com/2007/01/25/reading-from-a-word-document-with-com-in-php
> > >
> > > http://www.phpclasses.org/browse/package/388.html
> > >
> > >
> > > http://www.developertutorials.com/blog/php/extracting-text-from-word-documents-via-php-and-com-81/
> > >
> > >
> > > --
> > >
> > > Bastien
> > >
> > > Cat, the other other white meat
> > >
> > Unfortunately all of those use COM, which is only available on
> > Windows... I'm guessing this isn't possible on a proper OS?
> >
> >
> > Ash
> > www.ashleysheridan.co.uk
> >
> >
> Sadly, no... The closest you could come would be to see if you can
> manipulate OpenOffice via exec or something like that to read the
> document and do what you need to.

There's a Python script described in this article:

http://www.gsdesign.ro/blog/php-convert-microsoft-word-doc-to-pdf/

...that sounds like it will do what you want. Sucks that it's using
Python, but at least it's a technology that isn't hard to put on a Linux
machine. Other than that, I would perhaps recommend Mono with a .NET DLL
(or a similar construct).

HTH,


Todd Boyd
Web Programmer






