On 2017-09-26 15:09, Carsten Bormann wrote:
o A protocol SHOULD NOT forbid use of U+FEFF as a signature for
those textual protocol elements for which the protocol does not
provide character encoding identification mechanisms, when a ban
would be unenforceable, or when it is expected that
implementations of the protocol will not be in a position to
always use the mechanisms properly. The latter two cases are
likely to occur with larger protocol elements such as MIME
entities, especially when implementations of the protocol will
obtain such entities from file systems, from protocols that do not
have encoding identification mechanisms for payloads (such as FTP)
or from other protocols that do not guarantee proper
identification of character encoding (such as HTTP).
...which is *exactly* what we're discussing here?
This is only the pertinent case if we want it to be.
The “protocol” could be “RFCs are in UTF-8”. Done.
If the "protocol" was that, then of course a BOM wouldn't be needed. But
it is not.
I agree that if the goal was to promote an all-unicode world, the answer would be different. But the goal of the RFC Editor is to deliver documents that people will be able to read properly with the tools they have.
The browsers already work with Unicode, so that goal is already achieved.
"With the tools they have" as in "they are getting the files (http
download, rsync, ftp, whatnot) and open them locally with whatever is
registered for *.txt".
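
To make that concrete, here is a rough sketch of what such a local tool might do
when it has nothing but the bytes of the file. The function name and the
cp1252 default are assumptions for illustration only; real viewers differ in
their fallback, and this is not a description of any particular program.

```python
import codecs


def decode_like_legacy_viewer(raw: bytes, legacy: str = "cp1252") -> str:
    """Hypothetical viewer: trusts a UTF-8 BOM, else assumes a legacy code page."""
    if raw.startswith(codecs.BOM_UTF8):
        # The signature is the only in-band hint; "utf-8-sig" also strips it.
        return raw.decode("utf-8-sig")
    # No signature and no out-of-band metadata: fall back to the assumed default.
    return raw.decode(legacy, errors="replace")


name = "José"
without_bom = name.encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom

print(decode_like_legacy_viewer(with_bom))     # decoded as UTF-8
print(decode_like_legacy_viewer(without_bom))  # mis-decoded as cp1252
```

Under these assumptions the file with the signature comes out intact, while the
same bytes without it are read as cp1252 and the non-ASCII character is mangled.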
But there is a fundamental disconnect here: I *don’t* agree that the goal of the RFC editor is to maximize the viewing pleasure for RFCs. I believe we are creating RFCs to make the network work better, and we should focus on that goal when making decisions. The objective to move on to Unicode (UTF-8) has been in the IETF DNA for two decades now, and it is sad when there are places where we are losing sight of this.
It is indeed a fundamental disconnect.
I recommend that you review the discussion that got us where we are, and
I believe you'd be pleased that we actually got to the point where we
could convince people to move away from plain ASCII.
Best regards, Julian