Re: PHP class or functions to manipulate PDF metadata?

Peter Ford wrote:

> O. Lavell wrote:

[..]

>> Any and all suggestions are welcome. Thank you in advance.
>> 
> So many people ask about manipulating, editing and generally processing
> PDF files. In my experience, PDF is a write-once format - any
> manipulation should have been done in whatever source generated the PDF.
> I think of a PDF as being a piece of paper: if you want to change the
> content of a piece of paper it is usually best to chuck it away and
> start again...
> 
> Even more so, this would apply to the PDF metadata: metadata is supposed
> to describe the nature of the document: its author, creation time etc.
> That sort of data should be maintained with the document and ideally not
> changed throughout the document's lifetime (like the footer, or
> end-papers in a physical book).

Thank you very much for your reply. It's not that I disagree with you; 
I agree completely.

However...

PDFs often come from sources that can't be bothered to fill in the 
relevant fields correctly, completely, or at all. For those cases I would 
like the users of my application to be able to correct the values found 
in the metadata. Upload the PDF, get a nice little HTML form with 4 or 5 
values to review or edit. That sort of thing.

> I do accept that the metadata should be machine-readable: that part of
> your project is reasonable and I'm fairly sure that ought to be possible
> with something simple. The best bet I found so far is PDFTK
> (http://www.pdfhacks.com/pdftk/) which is a command-line tool that you
> could presumably call with exec or whatever...

Like I said, this is what I am already doing with the pdfinfo utility 
from xpdf.
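
For illustration, reading that output from PHP could look roughly like 
the sketch below. It assumes pdfinfo prints one "Key: value" pair per 
line; the function name parsePdfinfoOutput and the file name in the 
usage comment are made up for this example.

```php
<?php
// Turn the text printed by pdfinfo into an associative array.
// Assumption: each line looks like "Key:   value".
function parsePdfinfoOutput(string $text): array
{
    $meta = [];
    foreach (preg_split('/\R/', $text) as $line) {
        // Split on the first colon only, so values such as dates
        // keep their own colons.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && trim($parts[0]) !== '') {
            $meta[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $meta;
}

// Possible usage (hypothetical upload path):
//   $raw  = shell_exec('pdfinfo ' . escapeshellarg('/tmp/upload.pdf'));
//   $meta = parsePdfinfoOutput((string) $raw);
```

The resulting array can be used to pre-fill the HTML form.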

But now that you mention pdftk... I just tried it and it does seem to 
come close to what I want. It can write a new PDF with the contents of 
an existing one, taking the new metadata from a text file. So it 
shouldn't be very hard to write a little PHP around that process.
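
A first rough sketch of such a wrapper, under some assumptions: pdftk 
is on the PATH, and the text file uses the InfoKey/InfoValue record 
layout that pdftk's own dump_data operation prints (recent versions 
start each record with an InfoBegin line; older ones may not, so it is 
worth checking against the dump_data output of the installed version). 
The function names buildPdftkInfo and updatePdfMetadata are invented 
for this example.

```php
<?php
// Build the text that is handed to pdftk's update_info operation.
// Assumption: same record layout as the output of "pdftk in.pdf dump_data".
function buildPdftkInfo(array $fields): string
{
    $out = '';
    foreach ($fields as $key => $value) {
        // Collapse whitespace so every value stays on a single line.
        $value = trim(preg_replace('/\s+/', ' ', (string) $value));
        $out .= "InfoBegin\nInfoKey: {$key}\nInfoValue: {$value}\n";
    }
    return $out;
}

// Write a copy of $in to $out with the given metadata fields replaced.
// Returns true when pdftk exits with status 0.
function updatePdfMetadata(string $in, string $out, array $fields): bool
{
    $infoFile = tempnam(sys_get_temp_dir(), 'pdfmeta');
    file_put_contents($infoFile, buildPdftkInfo($fields));

    $cmd = sprintf(
        'pdftk %s update_info %s output %s',
        escapeshellarg($in),
        escapeshellarg($infoFile),
        escapeshellarg($out)
    );
    exec($cmd, $lines, $status);
    unlink($infoFile);

    return $status === 0;
}

// Possible usage (hypothetical paths and values):
//   updatePdfMetadata('/tmp/upload.pdf', '/tmp/fixed.pdf',
//       ['Title' => 'Annual report', 'Author' => 'J. Doe']);
```

The values would come straight from the submitted form, which is why 
every argument goes through escapeshellarg() and the metadata itself 
travels in a file rather than on the command line.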

Now I need to think a bit more about this approach. Perhaps it can be 
implemented using only pure PHP, after all. But for the time being, pdftk 
will do.

So thank you again for pushing me in that direction, even if 
unintentionally and despite the fact that what I am doing goes against 
your judgement ;)


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php