Re: formatting a word doc using php ?

On 17 September 2010 21:41, Vinay Kannan <vinykan@xxxxxxxxx> wrote:
> Hello Xperts,
>
> I am trying out a couple of things and have come across this requirement,
> where in a php script should create a doc file with specific formatting, I
> can create the file, havent tried it yet, but shouldnt be a problem I guess,
> but how do I format the doc file is the question, google search didnt give
> good results too :( So not even sure if it can be done actually.
>
> Thanks,
> Vinay
>

Depending on the document format you want to create, you have 3
mechanisms available.

1 : Use the older style doc format. This is a proprietary binary
format controlled by Microsoft. It has been reverse engineered (to a
degree) by various developers, and you can find many php/word tools
built on that work.
2 : Use the newer style XML format, which is human readable.
3 : Use PHP's COM interface to interact with an instance of MS Word.

Each has its merits, and the choice really depends on how complex you
want to go.

For example, option 3 allows you to do everything that you can do
manually. If you can record a macro doing it in Word, then you can
script it in PHP. Learning a little bit about type libraries and
reading the VBA documentation for Word will certainly help you there.
The main downside is that you need a license for Word and must be
running Windows. I use PHP's COM for interacting with Crystal Reports
Developer Edition. I've not built a report from scratch using PHP, but
that option is available to me, just as it is with MS Word. Using a
template, you can easily use PHP's COM to talk to Word, create a new
document from the template, replace bookmarks with text and finally
save the document.

Below is an example of reading a word count from a document called
Z:\5words.doc.

<?php
// Optional. When set, constant names are case sensitive:
// wdPropertyWords does NOT equal WDPROPERTYWORDS.
ini_set('com.autoregister_casesensitive', 1);
// Auto register the loaded type library - allows access to constants.
ini_set('com.autoregister_typelib', 1);
// Suppress "Warning: com::com(): Type library constant emptyenum is
// already defined in %s on line %d" messages.
ini_set('com.autoregister_verbose', 0);

$o_Word = new COM('Word.Application') or die('Cannot load MS Word');
$o_Word->Visible = 0;

$o_Doc = $o_Word->Documents->Open('z:/5words.doc');
echo 'There are ', $o_Doc->BuiltInDocumentProperties(wdPropertyWords),
' word(s) in this document.';
$o_Doc->Close(False);
$o_Word->Quit();
unset($o_Doc);
unset($o_Word);
?>
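
The template workflow mentioned earlier (new document from a template,
bookmarks replaced with text, document saved) might look roughly like
the sketch below. It is untested here and makes several assumptions:
the function name fillTemplate, the file paths and the bookmark names
are all made up for illustration, and it needs Windows, PHP's COM
extension and an installed copy of Word. The Word calls used
(Documents->Add, Bookmarks->Exists, Bookmarks->Item, SaveAs) come from
the Word object model as documented for VBA.

```php
<?php
// Sketch only: requires Windows, PHP's COM extension and MS Word.
// fillTemplate(), the paths and the bookmark names are hypothetical.

function fillTemplate(string $template, string $output, array $bookmarks): void
{
    // Throws a com_exception if Word cannot be started.
    $word = new COM('Word.Application');
    $word->Visible = 0;

    try {
        // Create a new document based on the template.
        $doc = $word->Documents->Add($template);

        foreach ($bookmarks as $name => $text) {
            if ($doc->Bookmarks->Exists($name)) {
                // Setting the range text typically removes the bookmark
                // itself, so each bookmark can only be filled once.
                $doc->Bookmarks->Item($name)->Range->Text = $text;
            }
        }

        $doc->SaveAs($output);
        $doc->Close(false);
    } finally {
        // Always shut Word down, even if something above failed.
        $word->Quit();
    }
}

// Only attempt the call where the COM class is actually available.
if (class_exists('COM')) {
    fillTemplate('z:/letter.dot', 'z:/letter-0001.doc', [
        'CustomerName' => 'Example Customer',
        'LetterDate'   => date('j F Y'),
    ]);
}
```

Wrapping the work in try/finally is a design choice to avoid leaving
hidden WINWORD.EXE processes behind when a step fails.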

Using Option 2 means learning how MS have defined their document
format. Styling isn't done as in old style HTML (<b>bold</b>); it is
more like (but not exactly) using CSS classes
(<p class="bold">bold</p>).
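
To make the "more like CSS" point concrete, here is a minimal sketch
that builds a document as a string. It assumes the single-file Word
2003 XML flavour (WordprocessingML), which may not be the flavour you
end up using. The helper names, the style id "Emphasised" and the
output file name are invented for this example. A named style is
declared once and each paragraph refers to it by id, much as an HTML
element refers to a CSS class.

```php
<?php
// Sketch only: assumes the Word 2003 single-file XML format.
// The helper names, the "Emphasised" style and the file name are made up.

// Build one paragraph, optionally pointing at a named paragraph style.
function wordMlParagraph(string $text, ?string $styleId = null): string
{
    $flags = ENT_QUOTES | ENT_XML1;
    $pPr = '';
    if ($styleId !== null) {
        $pPr = '<w:pPr><w:pStyle w:val="'
            . htmlspecialchars($styleId, $flags, 'UTF-8')
            . '"/></w:pPr>';
    }
    return '<w:p>' . $pPr . '<w:r><w:t>'
        . htmlspecialchars($text, $flags, 'UTF-8')
        . '</w:t></w:r></w:p>';
}

// Wrap the paragraphs in a document that declares the "Emphasised" style.
function wordMlDocument(array $paragraphs): string
{
    return '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>' . "\n"
        . '<?mso-application progid="Word.Document"?>' . "\n"
        . '<w:wordDocument'
        . ' xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">'
        . '<w:styles>'
        . '<w:style w:type="paragraph" w:styleId="Emphasised">'
        . '<w:name w:val="Emphasised"/>'
        . '<w:rPr><w:b/></w:rPr>'   // bold, defined once in the style
        . '</w:style>'
        . '</w:styles>'
        . '<w:body>' . implode('', $paragraphs) . '</w:body>'
        . '</w:wordDocument>';
}

$xml = wordMlDocument([
    wordMlParagraph('This paragraph is bold via its style.', 'Emphasised'),
    wordMlParagraph('This one uses the default formatting.'),
]);

file_put_contents('formatted.xml', $xml);
```

If the assumptions hold, Word should open formatted.xml directly. The
escaping through htmlspecialchars matters as soon as the text contains
characters such as < or &.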


Using Option 1 means you will be limited to whatever has been reverse
engineered. And if the binary format changes (though that is less
likely now due to the XML route), you'd have to wait for the developer
to fix the code first.


If XML is within your capability, then I'd try that first, then a
third party class and finally COM.

Richard.

-- 
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

-- 
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


