On Mon, Apr 06, 2020 at 03:17:34PM +0000, brian m. carlson wrote:

> > I did wonder if there are any standards around 8bit headers. Certainly
> > the de facto standard for local tools (e.g., mutt reading a message
> > you've edited in vim) is that they can be treated like a stream of
> > ASCII-compatible bytes, and that works pretty well in practice. But if
> > there's an IETF-endorsed method for 8bit headers, it would be nice to
> > use it. For 8bit bodies, we're able to give a content-transfer-encoding
> > and a content-type with the charset. But I don't know of an equivalent
> > for headers.
>
> That's RFC 6532, Internationalized Email Headers, the companion document
> to RFC 6531. (The RFC editor has cleverly kept the last digits in sync
> between the RFC 532x and 653x series).

Ah, thanks, that's exactly what I was looking for.

> The basic summary is that header field names are not internationalized,
> but the field values do allow UTF-8 if they contain unstructured text
> (e.g., Subject), anything using atoms (e.g., Message-ID), quoted strings
> (e.g., local-parts of an email address), domains, and a few other
> constructs. RFC 2047 (MIME encoded words) is allowed "only in a subset
> of the places allowed by" RFC 6532, so just not encoding should be safe
> here, as long as it's UTF-8.

That makes sense. It looks like such messages are technically
message/global rather than message/rfc822. But since there's no
content-type given for the outermost message of an mbox, I guess that
just becomes implied.

The utf8 thing means that doing:

  git format-patch --encoding=iso8859-1 --no-encode-headers

violates the standard. But I think that's OK. If you really prefer that
charset for your local use, it does what you want. And if you try to
send it over SMTP and somebody complains, I think that falls under "if
it hurts, don't do that".

-Peff
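
To make the UTF-8 point above concrete, here is a rough Python sketch (not
what Git itself does). It compares the raw bytes of an unencoded Subject
value in UTF-8 and in ISO-8859-1, and shows the RFC 2047 encoded-word form
as the ASCII-only alternative. The sample subject and the is_utf8() helper
are made up for illustration; Header, decode_header, and make_header come
from Python's standard email.header module.

```python
from email.header import Header, decode_header, make_header


def is_utf8(raw: bytes) -> bool:
    """Return True if the raw header value decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical subject containing one non-ASCII character.
subject = "Fix na\u00efve parser"

# Raw 8bit header bytes, as written when headers are left unencoded.
utf8_raw = subject.encode("utf-8")          # fits the UTF-8 requirement
latin1_raw = subject.encode("iso-8859-1")   # 8bit, but not valid UTF-8

utf8_ok = is_utf8(utf8_raw)
latin1_ok = is_utf8(latin1_raw)

# RFC 2047 encoded-word form: pure ASCII, with the charset named inline.
encoded = Header(subject, "iso-8859-1").encode()

# Decoding the encoded word recovers the original text.
roundtrip = str(make_header(decode_header(encoded)))

print(utf8_ok, latin1_ok)
print(encoded)
print(roundtrip == subject)
```

The ISO-8859-1 bytes fail the check because a reader expecting UTF-8 has no
charset label to fall back on, whereas the encoded word carries its charset
with it.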