----- Original Message -----
From: "Ned Freed" <ned.freed@xxxxxxxxxxx>
To: "TomPetch" <sisyphus@xxxxxxxxxxxxxx>
Cc: "Ned Freed" <ned.freed@xxxxxxxxxxx>; "ietf" <ietf@xxxxxxxx>
Sent: Sunday, December 25, 2005 12:35 AM
Subject: Re: Troubles with UTF-8

> > Presented with a comparable problem where
> > XML is in use, one WG has chosen to use an illegal XML sequence as a terminator
> > so what I was fishing for is if there were any parallels with UTF-8, which has
> > many illegal sequences of octets and so it would be easy to choose one as a
> > terminator.
>
> Using a construct that's syntactically illegal at a higher protocol level
> is one thing - I still wouldn't do it, but it is arguably OK. Using a sequence
> of octets that's not allowed by the underlying charset, OTOH, is a really
> bad idea. For one thing, various agents do perform syntax checks on charset
> data, so this is bound to cause major problems. And for another, such sequences
> are going to be specific to a particular character encoding scheme, which
> will make agents that transcode from, say, UTF-8 to UTF-16 pretty unhappy.
>
> If Unicode data needs to be self-terminated I strongly recommend using
> NUL to do it.
>
> Ned

The Unicode data I am thinking of may have come from an upper layer protocol
and needs to be passed transparently (as with an error or hello message, or
even an identity). It may or may not already be NUL-terminated (ever had that
security foul-up where some userids/passwords are entered/stored NUL-terminated
and some are not?) - hence I see the need to terminate the string in some other
way, or to escape or otherwise transfer-encode (parts of) the string.

I looked at existing RFCs and found many different approaches, all viable but
none that really said to me 'this is good engineering, this is best practice'.
Hence, floating the issue to see if there were any better ones out there. I
think not, which is of itself worth knowing.
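
To make the "escape, then terminate" option concrete, here is a minimal Python
sketch. It is only an illustration under stated assumptions, not something
taken from any RFC: the choice of ESC (0x1B) as the escape octet, the ESC "0"
encoding for an embedded NUL, and the names frame() and unframe() are all
hypothetical. Because only ASCII octets are inserted, the framed data remains
valid UTF-8, so agents that syntax-check or transcode it should not object.

```python
from typing import Tuple

ESC = b"\x1b"  # hypothetical escape octet, chosen for illustration only
NUL = b"\x00"  # terminator, as Ned recommends


def frame(text: str) -> bytes:
    """Escape embedded ESC and NUL octets, then append a NUL terminator."""
    payload = text.encode("utf-8")
    # Escape ESC first so the ESC octets introduced for NUL are not doubled.
    payload = payload.replace(ESC, ESC + ESC).replace(NUL, ESC + b"0")
    return payload + NUL


def unframe(data: bytes) -> Tuple[str, bytes]:
    """Return the first framed string and the octets that follow it."""
    end = data.index(NUL)  # escaped payload never contains a raw NUL
    body, rest = data[:end], data[end + 1:]
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i:i + 1] == ESC:
            nxt = body[i + 1:i + 2]
            if nxt == ESC:
                out += ESC
            elif nxt == b"0":
                out += NUL
            else:
                raise ValueError("malformed escape sequence")
            i += 2
        else:
            out.append(body[i])
            i += 1
    return out.decode("utf-8"), rest
```

With this, a string that already ends in NUL and one that does not are both
carried unchanged: unframe(frame(s)) gives back s either way, and several
framed strings can be concatenated and pulled apart one at a time.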

Tom Petch