Mark,

>> (Another
>> jibe, citing the fact that utf-8 is, itself, a modification to "raw" unicode
>> is probably worth repeating, here.)

MD> When Unicode is expressed as a series of bytes, there are a number of equally
MD> valid encoding schemes (aka serializations). UTF-8 is one of those schemes, and
MD> is no more or less a "modification", and no more or less "Unicode" than any
MD> other of these schemes.

That's right. It is an "encoding".

Raw Unicode takes more than 8 bits. Lots more. UTF-8 is a method of encoding
those raw bits into a non-raw form. So is the ACE approach.

My point was that folks tend to talk about UTF-8 as if it were the raw
representation, rather than a derivative encoding.

In fact, UTF-8 is exactly parallel to the ACE approach. It might be a more
efficient encoding, but it is no more "native" or "direct" or "raw" than ACE.

d/

--
Dave Crocker <dcrocker-at-brandenburg-dot-com>
Brandenburg InternetWorking <www.brandenburg.com>
Sunnyvale, CA USA <tel:+1.408.246.8253>
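
A minimal sketch of the parallel drawn above, assuming Python's built-in codecs. The message does not name a specific ACE, so Punycode is used here purely as a stand-in, and the sample string is an arbitrary choice. The same sequence of code points is serialized several ways, and each serialization decodes back to the identical string.

```python
# One Unicode string, several serializations of the same code points.
# Assumption: Punycode stands in for "the ACE approach"; the message
# does not say which ACE is meant.
text = "bücher"  # arbitrary sample containing one non-ASCII character

# The "raw" form: abstract code points, not bytes.
code_points = [ord(ch) for ch in text]

# Each entry is a derived byte encoding of those code points.
encodings = {
    "utf-8": text.encode("utf-8"),
    "utf-16-be": text.encode("utf-16-be"),
    "utf-32-be": text.encode("utf-32-be"),
    "punycode": text.encode("punycode"),  # ASCII-only output
}

for codec, data in encodings.items():
    # Every encoding is reversible, so none is more "raw" than another.
    assert data.decode(codec) == text
    print(f"{codec:10} {len(data):2d} bytes  {data!r}")
```

The encodings differ in length and in which byte values they use, but none of them is the code point sequence itself.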