Re: [idn] Re: FYI: BOF on Internationalized Email Addresses (IEA)

John Cowan <jcowan@xxxxxxxxxxxxxxxxx> · Wed, 29 Oct 2003 14:01:33 -0500

Dave Crocker wrote:

> Oh?  You mean that Unicode does not fit directly -- i.e., with no special
> encoding rules -- into 32 bits, or 24 bits, or some such?

Nope.  The Unicode character set maps characters to integers.  How the
integers are mapped to bytes is defined by the encoding rules, of which
there are seven standard ones:  UTF-8, UTF-16, UTF-16BE, UTF-16LE,
UTF-32, UTF-32BE, UTF-32LE.  All have equal status.
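
As a rough illustration of the two-step mapping (character to integer, then
integer to bytes), here is a minimal Python sketch.  The sample character
U+00E9 and the variable names are arbitrary choices for this example, and the
codec names are Python's spellings of the seven schemes.  Note that Python's
unmarked "utf-16" and "utf-32" codecs prepend a byte order mark and use the
platform's byte order, so those two lines of output can differ between
machines.

```python
# One abstract character, one integer (code point), seven byte serializations.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, chosen only as an example
code_point = ord(ch)  # the integer Unicode assigns to it: 233, i.e. U+00E9

# Python codec names for the seven encoding schemes listed above.
schemes = ["utf-8", "utf-16", "utf-16-be", "utf-16-le",
           "utf-32", "utf-32-be", "utf-32-le"]

# Map the same character to bytes under each scheme.
encoded = {name: ch.encode(name) for name in schemes}

for name, data in encoded.items():
    # Every byte sequence decodes back to the same integer.
    assert ord(data.decode(name)) == code_point
    print(f"{name:10} {data.hex(' ')}")
```

Each scheme yields a different byte sequence, yet all of them round-trip to
the same code point, which is the sense in which none is more "native" than
the others.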

> That's the difference between native representation, versus "encoding".

There is no native representation in the sense you mean.  All
representations are equal.

-- 
The duties of a lecturer are diverse,           John Cowan
as are those of the audience.                   jcowan@xxxxxxxxxxxxxxxxx
      --Edsger Dijkstra                         http://www.ccil.org/~cowan