--On Tuesday, 17 August, 2004 15:09 -0400 "Eric A. Hall" <ehall@xxxxxxxxx> wrote:

>> To be clear about this, I think there are three choices which
>> we might prefer in descending order:
>>
>> (1) There is a single canonical "wire" format in which
>> these things are transmitted.
>
> Such a specification would surely dictate "a series of
> message/rfc822 objects". But if we were to require that
> end-points perform conversion into a neutral form, we might as
> well go the whole nickel and just say "use multipart/digest",
> because that's where we'd end up after months of beating on
> each other.

>...

>> (2) The content-type specifies a conceptual form
>> ("application/mbox") but has _required_ parameters that
>> specify the specific form being transmitted.
>
> Global parameters are useless if the parser is intelligent
> enough to figure out the message structure independently.
> Given that such intelligence is a prerequisite to having a
> half-baked parser, the global parameters are always
> unnecessary.

This is a minor point compared to the one below, and probably not an issue here, but I can't let the above stand.

My impression of the MIME design, from the beginning, was that I should be able to inspect the content-type --including both the primary type and any parameters-- before deciding whether to retrieve and open the content of the body part. Yes, we have implementations and protocols that don't take advantage of that, or that can take advantage of the content-type only and not the parameters, but the MIME design, and hence the media type design, is that I should be able to tell whether I can parse (see below) the body part without opening it and applying heuristics to it.

In addition, what you seem to mean by "intelligent enough to figure out the message structure independently" is what I would mean if I said "know most or all of the formats and apply heuristics to figure out which one to apply".
We just don't do that, at least knowingly and with standards-track media types. You might rationally argue that this should be an exception if you could demonstrate that a reasonable set of heuristics would _always_ make the choice correctly, but that would require you to document either the heuristics or all of the variations on mbox and how to tell them apart (or both) -- something that several people have claimed is basically impossible and that you clearly haven't expressed a desire to do.

In that context, unless I completely misunderstand what is going on here, the "...prerequisite to having a half-baked parser..." assertion borders on the silly.

Take the example to which Tony has been pointing. Apparently the Solaris version of an mbox format is well-documented and based on content length information rather than key strings. That implies that, if (i) I know that what is coming is in that Solaris format and (ii) I have a rather primitive parser that knows how to find and deal with content lengths, then I can parse the format without any ability to "figure out the structure" at all. However, if I attempt to apply that parser to something that uses key strings, rather than lengths, or something that violates the assumptions Solaris makes about what gets included in the length computations, the parser is going to be bewildered at best and yield silly results at worst -- and that is exactly what content types and their parameters are intended to let receiving applications guard against.

> Actually, global parameters are more than useless. What if we
> have a mixed mbox file, where some messages are untagged BIG5
> and others are untagged 8859-1, or we have some messages have
> VMS::Mail addresses and others have MS/Mail addresses, or so
> forth? The global nature of global parameters ignores the
> per-message reality of the mbox structure.
>
> Global parameters can also be harmful if they conflict with
> reality.
That brings us to the main problem, or misunderstanding, or strawman, depending on one's perspective. I may be being excessively dense here, but, if I am, I seem to have significant company.

I am not asking (or suggesting) that you provide the information that would be required for multipart/, e.g., the content-type and associated parameters (such as charset, for your examples above) for each message (much less, for multipart/digest, that you force everything into the required subset of an RFC822 message body). While I have concerns that we didn't get multipart quite right and that it is too late to fix it, those concerns don't interact with my concern about application/mbox at all.

So let's move back a half-step. To me, the essence of the mbox format, conceptually, is that it consists of a sequence of blobs that are normally interpreted as messages in some format or other. You've made several convincing arguments that we should see them as blobs, not as messages, and I (and I think others) accepted them long ago. I think that such a blob collection is a reasonable thing to want to mail or otherwise use as a media type. And, again, I accept your argument that the blobs may not be valid 822/2822 messages or encapsulated 2821 messages, and, indeed, that there may be considerable format and content heterogeneity from one blob to the next.

Given that model, the key to an mbox format isn't the content of the blobs, it is the system used to decompose an mbox into a blob collection. If this were a multipart structure, the corresponding issue of interest would be whether the Boundary parameter provided enough information to separate the parts. It would not be what was in each of the parts or even what _their_ content types were. It would simply be the ability to separate the aggregate mbox into separate blobs.
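
As a rough illustration of the decomposition point above, here is a minimal Python sketch of two trivial de-blob-ing parsers selected by a declared parameter rather than by guessing. Everything in it is an assumption made up for illustration: the `framing` parameter name, its values `length` and `from-line`, and the `LEN <n>` framing are hypothetical and are not the rules of the Solaris format or of any other real mbox variant. The `From ` rule is deliberately simplified and ignores quoting.

```python
def split_by_length(data: bytes) -> list[bytes]:
    """Hypothetical length framing: a 'LEN <n>' line, then exactly n bytes."""
    blobs = []
    pos = 0
    while pos < len(data):
        eol = data.index(b"\n", pos)          # ValueError if no header line
        tag, _, count = data[pos:eol].partition(b" ")
        if tag != b"LEN":
            raise ValueError(f"no length header at offset {pos}")
        size = int(count)
        start = eol + 1
        if start + size > len(data):
            raise ValueError(f"blob at offset {pos} is truncated")
        blobs.append(data[start:start + size])
        pos = start + size
    return blobs


def split_by_from_line(data: bytes) -> list[bytes]:
    """Simplified separator framing: a line starting with 'From ' opens a blob."""
    blobs = []
    current = []
    for line in data.splitlines(keepends=True):
        if line.startswith(b"From ") and current:
            blobs.append(b"".join(current))
            current = []
        current.append(line)
    if current:
        blobs.append(b"".join(current))
    return blobs


# Hypothetical values of a required media type parameter.
SPLITTERS = {
    "length": split_by_length,
    "from-line": split_by_from_line,
}


def split_mbox(data: bytes, framing: str) -> list[bytes]:
    """Pick the splitter from the declared framing; never inspect the blobs."""
    try:
        splitter = SPLITTERS[framing]
    except KeyError:
        raise ValueError(f"unknown framing {framing!r}; refusing to guess") from None
    return splitter(data)
```

Neither splitter looks inside a blob or cares what it contains. Applying the wrong one either fails outright (the length splitter on `From `-separated data) or quietly splits in the wrong place (the `From ` splitter on a length-framed blob whose body happens to contain a line starting with `From `), which is the failure a declared parameter lets the receiver avoid.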
If I know that blobs are length-delimited according to some specific set of rules, I have that information and can build a trivial parser (and understand what it can't handle). If I know they are separated by indicator strings that obey some specific set of rules, then I can build a trivial parser (and understand what it can't handle). And so on for different sets of rules.

But, as far as I can tell, you don't want to give us that information. Instead, you want us to accept "well, it is some sort of mbox format, and you either need to guess at how to separate the blobs by examining the content or you need to get that information out of band". If that is what you intend this specification to imply, it is, IMnvHO, unacceptable for a standards-track media type.

Finally, if you argue that global parameters are unacceptable as a means of making the type of format/de-blob-ing distinctions outlined above, I suggest that you are not making a good case for leaving the parameters (and associated information) out. Instead, you are making a case that this registration should be a family of, e.g.,

    application/mbox-solaris-v5-and-later
    application/mbox-sendmail-v2-v4

and similar things (examples made up).

    john

_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf