This is my (personal) response to comments on draft-hardy-pdf-mime -02 and -03. I haven't had a chance to review all of these with all of the co-authors. There is a new -04 draft now. I'll send individual responses as requested with references to this discussion; I’m trying to reply to reviews by:

S Moonesamy, Phillip Hallam-Baker, John Klensin, Kathleen Moriarty, Stephen Farrell, Dan Romascanu, Mirja Kuehlewind, Ben Campbell

========
History:

I volunteered to help with draft-hansen-rfc-use-of-pdf; the history of PDF wasn't right, and the registration in RFC 3778 did need to reflect the ‘change controller’. This draft was started in the same github repo, until
https://github.com/masinter/pdfrfc/commit/a7208154d541613ba66f59aab6e1507754fe26e4

I talked the ISO committee working on PDF 2 into taking ownership of the "fragment identifier" semantics; it wasn't part of the PDF 1.7 definition that got fast tracked, but it belonged with the PDF spec. (In general, those defining file formats and registering them need to be prodded into owning the "fragment identifier semantics".)

I don't know of any other instance of an ISO committee defining a media type. The ISO committee wants to put the RFC number into the PDF 2 spec, while this spec wants to make normative reference to the (likely-to-be-approved-in-2017) PDF-2 spec.

Note that application/pdf was FIRST registered in 1993 (23 years ago) by Paul Lindner for use in the gopher protocol. I was one of the GopherCon '93 attendees who urged him to do so (and urged TimBL to use content-type in HTTP/1.0). "application/pdf" was chosen before the introduction of the vnd. prefix. When the "standards tree" and "vendor tree" distinction was introduced, application/pdf was grandfathered rather than forced to application/vnd.adobe.pdf. Only relatively recently has it qualified for the "standards tree".

An RFC might not even be necessary just to update the media type registry.
(The text/html registration was updated without an RFC to obsolete RFC 2854.)

Does any of this history belong in the document? I didn't think so.

======
Editorial:

The paragraph numbers are a feature of xml2rfc. I turned them off.

Yes, the Introduction should say "It obsoletes [RFC3778]." and RFC3778 should be added to Informative references.

======
Interoperability considerations:

I put in a reference to ISO 32000-1 Annex I, "PDF Versions and Compatibility", which talks about the use of version numbers and backward compatibility.
http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=735

There's a lengthier blog post/paper by Jim King:
http://blogs.adobe.com/insidepdf/2009/08/pdf_evolution_and_compatibilit.html
http://blogs.adobe.com/insidepdf/Compatibility_090819.pdf

... but it's older, Jim has retired and is unlikely to update it, and I don't know if the URL is stable (it's a blog, not an official document). Is it worth referencing? I thought not.

======
Security considerations:

There were a lot of comments about Security Considerations. It's true that lots of RFC 3778's security considerations got replaced; here's the commit:
https://github.com/mrbhardy/pdfmime/commit/76904f445bd35a472f759bc45a25d24f695d40f0

As I understood it, the feeling was that the previous text was wrong or misleading, that these days many formats allow scripting, and that the techniques and reasons for sandboxing are well known.

Version -04 adds to the end of “Security Considerations”:

   PDF interpreters executing any scripts or programs related to these constructs must be extremely careful to insure that untrusted software is executed in a protected environment.

PDF has been around a long time; a search for "PDF malware" turns up lots of hits. I too wish there were an ISO or other document that could be cited, but I haven't found one. Is it necessary to say more in the MIME type registration?

It might help to clarify who "Security Considerations" in MIME registrations are mainly for.
Don't think of developers of PDF viewers/interpreters; think of system administrators and makers of firewalls and proxies: people who don't really care about PDF, who won't study the spec, and who just want to know what they should watch for beyond the usual for every file type. Having a target audience might help.

========
(PH-B) MIME types should identify content which has scripts/macros:

I wasn't sure if you meant the type itself, or that content should be labeled. I'm not sure how this applies to the application/pdf registration. In general, a label "I have no scripts" isn't helpful because bad guys can lie [RFC 3514]. You can refuse to run scripts, or run them with limited access.

=====
Signatures add trust:

> PDF also has a signature capability which is relevant. If the Macros are
> signed by a trustworthy party, they are less of a concern than random
> Macros.

Is this true? Malware that reproduces itself can't be signed and transmitted.

=======
Subsets without scripting?

> ... some of the subsets do not allow
> embedded scripting. If that is correct, it should certainly be
> mentioned.

I mentioned it for PDF/A in passing, although it is not clear how this helps. A file could lie and say it was PDF/A but still have scripts.

================
Adobe PDF vs ISO specs

> how the current ISO version of PDF compares to the Adobe version

ISO 32000-1 was adopted using the ISO Fast-Track process and is technically identical to Adobe Portable Document Format version 1.7. The Fast-Track process doesn't allow any technical changes; it was just rewritten in ISO-spec style.

> If the difference is significant, then a new
> media type, not reuse of an old one, is required. Even if it is
> not that significant, it appears to me (as a co-author of RFC
> 6838) that there is a strong case to be made for parameters that
> identify versions and/or specific subsets to help applications
> to identify viewers or processors that will not fail.
> The authors may have good reasons to not include either parameter,
> but it seems to me that the I-D should then explain why not.

There are no technical differences between Adobe PDF 1.7 and ISO 32000-1:2008.

I don't see a case for version or subset parameters in content-type headers here, or almost anywhere that backward and forward compatibility has been carefully planned. As with most living file formats, if you want to consume files that use the latest features to their fullest, you want an up-to-date viewer; publishers can target earlier versions for wider applicability and reliability.

I think that’s it. Thanks all for your reviews,

Larry
--
http://larry.masinter.net