O. Lavell wrote:
> Hi group,
>
> I am looking for an easy way to manipulate (read, write) the metadata
> (title, subject, keywords, author) in PDF files through PHP.
>
> Most PHP/PDF solutions I have found so far (through Google) are aimed at
> constructing PDFs from text and graphics, with lots of fancy features,
> but most of them omit metadata functions altogether.
>
> I would also prefer something extremely lightweight that I could just
> include_once() into my script, i.e. not a module or external program. I
> am currently using pdfinfo from xpdf-utils, but it has to go.
>
> My use case is I want to build a database with the metadata of a bunch
> (many hundreds, perhaps thousands) of PDF files in a directory on the
> server for easy search, statistics and retrieval. I also want users to be
> able to make edits to any PDF's metadata from the web.
>
> If it can be at all avoided, I would rather not have to invent the wheel
> myself here. I have looked at the Adobe PDF specification a bit and it
> looks quite... challenging. Or should I say daunting.
>
> Any and all suggestions are welcome. Thank you in advance.
>

So many people ask about manipulating, editing and generally processing
PDF files. In my experience, PDF is a write-once format: any manipulation
should have been done in whatever source generated the PDF. I think of a
PDF as being a piece of paper: if you want to change the content of a
piece of paper it is usually best to chuck it away and start again...

Even more so, this would apply to the PDF metadata. Metadata is supposed
to describe the nature of the document: its author, creation time etc.
That sort of data should be maintained with the document and ideally not
changed throughout the document's lifetime (like the footer, or
end-papers in a physical book).

I do accept that the metadata should be machine-readable: that part of
your project is reasonable and I'm fairly sure that ought to be possible
with something simple.
The best bet I have found so far is PDFTK (http://www.pdfhacks.com/pdftk/),
a command-line tool that you could presumably call with exec() or
whatever...

--
Peter Ford                              phone: 01580 893333
Developer                               fax:   01580 893399
Justcroft International Ltd., Staplehurst, Kent

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
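For reference, the exec route suggested in the reply might be sketched roughly as follows. This is only an illustration under some assumptions: the pdftk binary is installed and on the PATH, its `dump_data` operation prints the Info dictionary as `InfoKey:` / `InfoValue:` line pairs, and the function names `parse_pdftk_info()` and `read_pdf_metadata()` are made up for this example. Note that it still relies on an external program, which the original poster hoped to avoid.

```php
<?php
// Sketch only: assumes pdftk is installed and reachable via the PATH.

/**
 * Parse the text printed by `pdftk file.pdf dump_data` into a
 * key => value array of Info entries (Title, Author, ...).
 * Hypothetical helper, not part of pdftk or PHP.
 */
function parse_pdftk_info(string $dump): array
{
    $info = [];
    $key  = null;
    foreach (preg_split('/\R/', $dump) as $line) {
        if (preg_match('/^InfoKey: (.*)$/', $line, $m)) {
            // Remember the key until its matching value line turns up.
            $key = $m[1];
        } elseif ($key !== null && preg_match('/^InfoValue: ?(.*)$/', $line, $m)) {
            $info[$key] = $m[1];
            $key = null;
        }
    }
    return $info;
}

/**
 * Run pdftk on one file and return its metadata.
 * Hypothetical helper; throws if pdftk exits with a non-zero status.
 */
function read_pdf_metadata(string $pdfPath): array
{
    $lines  = [];
    $status = 0;
    // escapeshellarg() guards against odd characters in file names.
    exec('pdftk ' . escapeshellarg($pdfPath) . ' dump_data', $lines, $status);
    if ($status !== 0) {
        throw new RuntimeException("pdftk failed for $pdfPath");
    }
    return parse_pdftk_info(implode("\n", $lines));
}
```

Looping `read_pdf_metadata()` over the directory would give the rows for the search database. For the editing side, pdftk also has an `update_info` operation that takes a text file in the same key/value layout and writes a new PDF (`pdftk in.pdf update_info data.txt output out.pdf`); that part is not shown here.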