Dave

Is this an OK use of RFC4234? Reading it, I am not clear whether U+FEFF should
be specified as %xFE %xFF or whether %xFEFF is OK.

And what is the ABNF for any possible ISO 10646 character, all 97000 of them?

Tom Petch

----- Original Message -----
From: "Ned Freed" <ned.freed@xxxxxxxxxxx>
To: "TomPetch" <sisyphus@xxxxxxxxxxxxxx>
Cc: "ietf" <ietf@xxxxxxxx>
Sent: Friday, December 23, 2005 7:13 PM
Subject: Re: Troubles with UTF-8

<snip>

> > B) Code point. Many standards are defined in ABNF [RFC4234], which allows code
> > points to be specified as, e.g., %b00001101, %d13 or %x0D, none of which are
> > terribly Unicode-like (U+000D). The result is standards that use one notation
> > in the ABNF and a different one in the body of the document; should ABNF allow
> > something closer to Unicode (as XML has done with &#x0D;)?
>
> ABNF is charset-independent, mapping onto non-negative integers, not
> characters. Nothing prevents a specification from saying that a given ABNF
> grammar specifies a series of Unicode characters represented in UTF-8 and using
> %xFEFF or whatever in the grammar itself.
>
<snip>
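
For illustration only, here is a small Python sketch of the two readings in question. In RFC4234, %xFEFF is a single terminal whose value is 0xFEFF, whereas %xFE %xFF (or %xFE.FF) is a sequence of two terminals, 0xFE followed by 0xFF. Which form fits depends on whether the specification says its grammar ranges over Unicode code points or over UTF-8 octets. The helper names below are hypothetical, and the "any character" rule assumes that surrogate code points are excluded.

```python
# Hypothetical helpers for illustration; none of these names come from RFC4234.

def abnf_code_point(ch: str) -> str:
    """ABNF terminal for one character when the grammar is over code points."""
    return f"%x{ord(ch):X}"


def abnf_utf8_octets(ch: str) -> str:
    """ABNF terminal sequence when the grammar is over UTF-8 octets."""
    return "%x" + ".".join(f"{b:02X}" for b in ch.encode("utf-8"))


# Assumption: "any ISO 10646 character" is taken to mean any code point in
# U+0000..U+10FFFF except the surrogate range U+D800..U+DFFF.
ANY_CODE_POINT_RULE = "%x0-D7FF / %xE000-10FFFF"


def is_scalar_value(cp: int) -> bool:
    """True if cp falls inside the ranges named by ANY_CODE_POINT_RULE."""
    return 0 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


if __name__ == "__main__":
    for ch in ("\r", "\ufeff"):
        print(f"U+{ord(ch):04X}", abnf_code_point(ch), abnf_utf8_octets(ch))
    # U+000D %xD %x0D
    # U+FEFF %xFEFF %xEF.BB.BF
```

So a grammar over code points would write U+FEFF as %xFEFF, while a grammar over UTF-8 octets would write it as %xEF.BB.BF; %xFE %xFF matches neither reading.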