Re: universal character name

Ian Lance Taylor <iant@xxxxxxxxxx> · Tue, 12 Oct 2010 08:05:07 -0700

Patrick Horgan <phorgan1@xxxxxxxxx> writes:

> I was building a test program that I wrote a long time ago to check a
> utf-8 code conversion facet, and it will no longer build because the
> line that checks to see if "\xc2\x80" really converts to "\u0080"
> doesn't compile.  This is entirely correct, of course since the
> current draft standard in section 2.3.2 says that any program with
> universal character names in the control character ranges of 0x00–0x1F
> or 0x7F–0x9F is ill-formed.  That means I can't say "\u0080". Still,
> the characters are valid Unicode, and it makes it more annoying.
>
> Is "\x0\x80" the equivalent, or can there be endianness issues with
> wide chars?

I don't understand your question.  I don't know which standard you are
referring to.  There shouldn't be any restriction on using control
character ranges in string constants.  Are you trying to use these
characters in an identifier name?  That doesn't sound like something you
would need for a code conversion facet.

You said you wanted to use UTF-8, but \x0\x80 is not UTF-8.  There can
indeed be endianness issues with wide chars but it really depends on
various things you haven't told us.
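
To make the byte-level point concrete, here is a rough sketch, assuming
the character in question is the code point U+0080.  Both helper names
(encode_utf8 and bytes_in_memory) are made up for illustration and are
not from your test program or from any library.  The first shows what
UTF-8 produces for a code point; the second shows how a 16-bit code
unit sits in memory on whatever machine runs it.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
// No validation (surrogates, values above U+10FFFF) is attempted.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // One byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: return the bytes of a 16-bit code unit in the
// order they are stored in memory on the host.  The order depends on
// the host's endianness.
std::string bytes_in_memory(std::uint16_t unit) {
    char buf[sizeof unit];
    std::memcpy(buf, &unit, sizeof unit);
    return std::string(buf, sizeof unit);
}
```

With these, encode_utf8(0x80) yields the two bytes C2 80, which is what
"\xc2\x80" spells out.  The bytes 00 80 are a different sequence: that
is how bytes_in_memory(0x0080) would look on a big-endian host, while a
little-endian host would store 80 00.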

Can you clarify what you are asking?  It may help to show a bit of
code.

Ian