Re: Last Call: 'The APPLICATION/MBOX Media-Type' to Proposed Standard

On Thu, 12 Aug 2004 17:18:19 EDT, Tony Hansen said:
> The information about the mbox format being anecdotally defined is 
> incorrect. The mbox format has traditionally been documented in the 
> binmail(1) or mail.local(8) man pages (BSD UNIX derivatives) or mail(1) 
> man page (UNIX System 3/5/III/V derivatives). There have been several 
> variants of the mbox format in use by those different systems. The most 
> complete description of an mbox format can be seen in the man page from 
> any UNIX System Vr4 derived system, such as Solaris.

Umm.. Tony?  I hate to say it, but if there have been several variants used in
the wild, and the man pages for said variants document different formats,
that's awfully close to "anecdotally defined" when you're doing a standard.

For example, a Solaris 8 box across the hall says in 'man mail.local':

     Each delivered mail message in the mailbox is preceded by  a
     "Unix From line" with the following format:

          From sender_address time_stamp

     The sender_address  is  extracted  from  the  SMTP  envelope
     address  (the  envelope  address  is  specified  with the -f
     option).

     A trailing blank line is also added to the end of each  mes-
     sage.

Hmm. Nothing about whether the sender_address is, or should be, <bracketed>.
Nothing about the format of the time_stamp. Nothing about '>From ' stuffing
(and yes, I've seen systems that don't do it at all, and systems that only
do the >-stuffing if the line matches a regexp for what *they* think the entire
'From ' line looks like(*)). The Sendmail 8.13.1 mail.local does say that
the >-stuffing happens for lines "which could be mistaken for a ``From ''
delimiter line", and the code actually checks for exactly 5 chars...
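
To make the difference between those two stuffing policies concrete, here is a
rough sketch in Python. It is not taken from any of the implementations
mentioned; the function names and the STRICT_FROM pattern are made up for
illustration, and the pattern is just one guess at what an "entire From line"
regexp might look like.

```python
import re

def stuff_prefix(body_lines):
    """Quote every line whose first five characters are 'From '."""
    return [">" + line if line.startswith("From ") else line
            for line in body_lines]

# Hypothetical strict pattern: 'From addr Www Mmm dd hh:mm:ss yyyy'.
# Real systems each had their own idea of this; this one is an assumption.
STRICT_FROM = re.compile(
    r"^From \S+ +[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$"
)

def stuff_strict(body_lines):
    """Quote only lines that match the writer's full From-line pattern."""
    return [">" + line if STRICT_FROM.match(line) else line
            for line in body_lines]

body = [
    "From here on, things get ugly.",
    "From alice@example.com Fri Aug 13 20:21:32 2004",
    "Regards",
]

prefix_result = stuff_prefix(body)   # quotes the first two lines
strict_result = stuff_strict(body)   # quotes only the second line
```

A reader that splits on any line starting with 'From ' will see a bogus message
boundary in the output of the strict variant, which is exactly the
interoperability problem.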

Any doubts that this whole mess is at best anecdotally defined can be dispelled by
mentioning "Content-Length:" (interestingly enough, not even mentioned in the
Solaris or Sendmail man pages, although the Sendmail source tree does mention
that building on Solaris 2.3 or later will turn it on).  Of interest mostly because
the Content-Length: is so easily broken by later >-stuffing/unstuffing or other
similar conversion...
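
A small Python sketch of how that breakage happens, assuming the length was
computed before a later tool applied simple 5-char >-stuffing. The stuff()
helper and the sample body are invented for the example.

```python
def stuff(body: bytes) -> bytes:
    """Prefix '>' to every body line that starts with 'From '."""
    lines = body.split(b"\n")
    return b"\n".join(b">" + l if l.startswith(b"From ") else l
                      for l in lines)

body = b"Hello,\nFrom now on the count is off.\nBye\n"

declared = len(body)   # what a Content-Length: header would have recorded
stored = stuff(body)   # what ends up in the mailbox after stuffing

# Each stuffed line adds one byte, so the recorded length no longer
# points at the end of the message.
drift = len(stored) - declared
```

A reader that trusts the header will stop one byte short for every stuffed
line, and then has to guess where the next message really starts.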

(*) time_stamp. Argh.  Fought with this during a data/machine migration.
Write code that will accept a 26 byte ctime format: 'Fri Sep 13 00:00:00 1986\n\0'.
Works fine once you realize that some systems just used 'From envelope_address'
without a timestamp.

Then I get handed this: 'Fri Aug 13 20:21:32 EDT 2004'.  Fix that, and find some
joker running in a French locale: 'vendredi, 13 août 2004, 20:22:01 EDT'.
And yes, his b0rked software only >-stuffed 'From ' lines that regexp-matched
the *French* variant. Took me *quite* some time to twig into THAT one...
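
For what it's worth, the tolerant parsing that war story leads to might look
something like the Python sketch below. It is not the migration code, just an
illustration: parse_from_line() is a made-up name, it only recognizes the plain
ctime layout and the ctime-plus-timezone layout shown above, and it assumes the
default C locale for the day and month names. Anything else (including the
French one) is handed back unparsed rather than rejected.

```python
from datetime import datetime

def parse_from_line(line):
    """Return (sender, datetime or None, raw timestamp text)."""
    if not line.startswith("From "):
        raise ValueError("not a From line")
    rest = line[5:].rstrip("\n")
    sender, _, stamp = rest.partition(" ")
    stamp = stamp.strip()
    if not stamp:
        # Some systems wrote 'From envelope_address' with no timestamp.
        return sender, None, ""
    fields = stamp.split()
    if len(fields) == 6:
        # Possibly 'Www Mmm dd hh:mm:ss ZONE yyyy': drop the zone field.
        fields = fields[:4] + fields[5:]
    if len(fields) == 5:
        try:
            when = datetime.strptime(" ".join(fields), "%a %b %d %H:%M:%S %Y")
            return sender, when, stamp
        except ValueError:
            pass
    # Unknown layout: keep the raw text so nothing is lost.
    return sender, None, stamp

plain = parse_from_line("From a@example.com Fri Aug 13 20:21:32 2004\n")
zoned = parse_from_line("From a@example.com Fri Aug 13 20:21:32 EDT 2004\n")
bare = parse_from_line("From a@example.com\n")
french = parse_from_line("From a@example.com vendredi, 13 août 2004, 20:22:01 EDT\n")
```

Note that the zone name is simply thrown away here, so the resulting datetime
is naive; whether that is acceptable depends on what the migration needs.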
