On Mon, 16 Aug 2004 22:47:52 EDT, Tony Hansen said:

> The claim in Appendix A is that there were no authoritative sources of
> documentation for the mbox formats and otherwise it's "only documented
> in anecdotal form". I'm sorry, but the definitions ARE there, and
> ARE almost always authoritative for those systems.

Somehow, I can't get thrilled by the concept of saying a format is
documented because we have (for example) 3 systems, each with an
authoritative definition of the version it uses, when those definitions
are incompatible with each other. (And yes, the Solaris 'Content-Length:'
scheme and '>From ' escaping are basically incompatible - there exist
messages that can't be converted from one to the other without
information loss.)

> Because Solaris 8 is System Vr4-derived, you should look at 'man mail'
> for the definitive definition. You'll find Content-Length: documented
> there.

It says:

     A letter is composed of some header lines followed by a blank
     line followed by the message content. The header lines section
     of the letter consists of one or more UNIX postmarks:

          From sender date_and_time [remote from remote_system_name]

     followed by one or more standardized message header lines of
     the form:

          keyword-name: [printable text]

     where keyword-name is comprised of any printable, non-whitespace
     characters other than colon (`:'). A Content-Length: header
     line, indicating the number of bytes in the message content
     will always be present unless the letter consists of only
     header lines with no message content.

For bonus points - is the 'crlf-crlf' between the header and the body
included in the Content-Length:?

There are other issues as well - what if the Content-Length: is computed
across a non-canonified message? How do you then send it across the wire?

'man mail' doesn't mention escaping a 'From ' inside a message, except
for this:

     The default mode for printing messages is to display only those
     header lines of immediate interest. These include, but are not
     limited to, the UNIX From and >From postmarks, From:, Date:,
     Subject:, and Content-Length: header lines, and any recipient
     header lines such as To:, Cc:, Bcc:, and so forth. After the
     header lines have been displayed, mail

Of course, that's because Solaris has Content-Length: and therefore
doesn't use '>From ' escaping.

Should other systems trust the value of a Content-Length:? Should other
systems be required to include a Content-Length:? Should other systems
escape a 'From ' iff there's no Content-Length:? What if an mbox file
has a Content-Length: on some items but not others? How do you recover
from a corrupted Content-Length:?

So - where is the *one true canonical* definition of an mbox that
actually answers all these basic questions that an implementer *needs*
to know the answer to?
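
To make the incompatibility concrete, here is a minimal Python sketch. It
is not taken from any of the man pages discussed above; all helper names
(escape_from, unescape_from, split_by_from_scan, split_by_content_length,
make_message) are hypothetical. It assumes that Content-Length: counts
only the body after the blank line, which is exactly the point the man
page leaves open, and it uses ASCII-only text so that character counts
and byte counts coincide.

```python
import re


def escape_from(body):
    """Prefix '>' to every body line that starts with 'From '."""
    return "".join(
        ">" + line if line.startswith("From ") else line
        for line in body.splitlines(keepends=True)
    )


def unescape_from(body):
    """Naive inverse: drop one '>' from every line starting with '>From '."""
    return "".join(
        line[1:] if line.startswith(">From ") else line
        for line in body.splitlines(keepends=True)
    )


def split_by_from_scan(mbox):
    """Start a new message at every line that begins with 'From '."""
    messages, current = [], []
    for line in mbox.splitlines(keepends=True):
        if line.startswith("From ") and current:
            messages.append("".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("".join(current))
    return messages


def split_by_content_length(mbox):
    """Trust Content-Length: (ASSUMED to be the body size after the blank line)."""
    messages, pos = [], 0
    while pos < len(mbox):
        header_end = mbox.index("\n\n", pos) + 2
        match = re.search(r"^Content-Length:\s*(\d+)", mbox[pos:header_end],
                          re.MULTILINE | re.IGNORECASE)
        if match is None:
            raise ValueError("no Content-Length: header; cannot find body end")
        body_end = header_end + int(match.group(1))
        messages.append(mbox[pos:body_end])
        pos = body_end
        while pos < len(mbox) and mbox[pos] == "\n":
            pos += 1  # skip blank separator lines between messages
    return messages


def make_message(sender, body):
    """Build one ASCII-only mbox entry, so len(body) equals the byte count."""
    return (f"From {sender} Mon Aug 16 22:47:52 2004\n"
            f"Content-Length: {len(body)}\n\n{body}")


# A body holding both a real 'From ' line and a literal '>From ' line.
body = "Hi,\nFrom now on, use the new server.\n>From the old FAQ\n"
print(unescape_from(escape_from(body)) == body)  # False: round trip is lossy

mbox = make_message("alice", body) + "\n" + make_message("bob", "Second.\n")
print(len(split_by_content_length(mbox)), len(split_by_from_scan(mbox)))  # 2 3
```

The first check shows the information loss: once escaped, the reader can
no longer tell a line that was originally 'From ' from one that was
originally '>From '. The second shows two readers disagreeing about the
same file: the one that trusts Content-Length: finds two messages, while
the one that scans for 'From ' lines finds three.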