On 20/09/2017 07:19, Julian Reschke wrote:
> On 2017-09-19 20:35, Ted Lemon wrote:
>> On Sep 19, 2017, at 1:16 PM, Julian Reschke <julian.reschke@xxxxxx
>> <mailto:julian.reschke@xxxxxx>> wrote:
>>> Can you please point to *something* that says it's wrong to use the
>>> BOM in UTF-8 encoded documents of type text/plain?
>>
>> It's pretty clearly wrong to download a document that's labeled
>> "text/plain;charset=utf8" and then store it in a way that will result in
>> it being treated as having a different encoding, or to display it
>> directly using a different encoding, as Explorer does. Since the BOM
>> is not required or even encouraged by the Unicode Consortium, failing to
>> get this right is clearly a bug.
>
> I'm pretty sure that if the browser modified the downloaded file,
> somebody else would claim that *that* would be a bug. For instance, it
> would affect signatures.

Yes. If you consider that the exact content of a file is what you want
when you transfer it from another machine, hidden changes to the content
are plain wrong.

Storing metadata with the file is another matter - in an ideal world you
would store "text/plain;charset=utf8" as part of the metadata. But that
isn't the world we live in; all we store for sure is the filename, which
may include the string ".txt", which may mean the same as "text/plain"
but certainly doesn't imply "charset=utf8".

In any case, regardless of how we believe UTF-8 *strings* should be
embedded in protocols, the decision to prepend the equivalent of
"charset=utf8" to a file containing a UTF-8 string is not a protocol
issue.

Modest suggestion: store BOM-free versions named rfc8187.utx etc. Start
a trend that tool implementers can follow.

Brian
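
P.S. For what it's worth, here is a rough Python sketch of what the modest suggestion might look like in a tool. It is only an illustration under assumptions: the function names are hypothetical, and the ".utx" suffix is just the naming idea floated above, not an established convention. The downloaded file is left byte-for-byte untouched (so signatures over it still verify), and the BOM-free text goes into a separate copy.

```python
import codecs
from pathlib import Path


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


def write_bom_free_copy(src: Path) -> Path:
    """Write a BOM-free copy next to src, using the hypothetical .utx suffix.

    The original file is only read, never modified.
    """
    dst = src.with_suffix(".utx")
    dst.write_bytes(strip_utf8_bom(src.read_bytes()))
    return dst
```

Something like write_bom_free_copy(Path("rfc8187.txt")) would then produce rfc8187.utx alongside the original, assuming such a file exists locally.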