ABNF Re: Troubles with UTF-8

Dave

Is this an ok use of RFC4234?  Reading it, I am not clear whether U+FEFF should
be specified as %xFE %xFF or whether %xFEFF is ok?  And what is the ABNF for any
possible ISO 10646 character, all 97000 of them?

Tom Petch

----- Original Message -----
From: "Ned Freed" <ned.freed@xxxxxxxxxxx>
To: "TomPetch" <sisyphus@xxxxxxxxxxxxxx>
Cc: "ietf" <ietf@xxxxxxxx>
Sent: Friday, December 23, 2005 7:13 PM
Subject: Re: Troubles with UTF-8
<snip>

> > B) Code point. Many standards are defined in ABNF [RFC4234], which allows
> > code points to be specified as, e.g., %b00001101, %d13 or %x0D, none of
> > which are terribly Unicode-like (U+000D).  The result is standards that use
> > one notation in the ABNF and a different one in the body of the document;
> > should ABNF allow something closer to Unicode (as XML has done with
> > &#x000D;)?
>
> ABNF is charset-independent, mapping onto non-negative integers, not
> characters. Nothing prevents a specification from saying that a given ABNF
> grammar specifies a series of Unicode characters represented in UTF-8 and
> using %xFEFF or whatever in the grammar itself.
>
<snip>
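
A minimal sketch of the distinction in question, assuming the convention
described above: each ABNF numeric terminal is one non-negative integer, the
specification declares those integers to be Unicode code points, and the
characters are then represented in UTF-8. The helper names abnf_values and
as_utf8 are hypothetical, for illustration only, and the parser handles just
the plain %b / %d / %x forms (no ranges, no dotted concatenation).

```python
# Illustrative only: these helpers are not part of RFC 4234 or any library.

def abnf_values(terminals: str) -> list[int]:
    """Return the integers named by space-separated ABNF numeric terminals.

    Handles only the simple forms %b..., %d... and %x...; value ranges
    ("%x30-39") and concatenations ("%x0D.0A") are deliberately left out.
    """
    bases = {"b": 2, "d": 10, "x": 16}
    values = []
    for token in terminals.split():
        if len(token) < 3 or token[0] != "%" or token[1].lower() not in bases:
            raise ValueError(f"not a simple ABNF numeric terminal: {token!r}")
        values.append(int(token[2:], bases[token[1].lower()]))
    return values


def as_utf8(values: list[int]) -> bytes:
    """Treat each integer as a Unicode code point (an assumption the
    specification has to state) and encode the resulting string as UTF-8."""
    return "".join(chr(v) for v in values).encode("utf-8")


# One terminal, one integer: the code point U+FEFF.
one = abnf_values("%xFEFF")

# Two terminals, two integers: the code points U+00FE and U+00FF.
two = abnf_values("%xFE %xFF")

print([hex(v) for v in one], as_utf8(one).hex())
print([hex(v) for v in two], as_utf8(two).hex())
```

Under that assumption %xFEFF names the single character U+FEFF, whereas
%xFE %xFF names two separate characters, so the two spellings are not
interchangeable. By the same reasoning, a value range such as
%x0-D7FF / %xE000-10FFFF would cover every Unicode scalar value, provided
the specification says that the integers are code points rather than octets.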


_______________________________________________

Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf
