RE: [RFC PATCH] Windows: Assume all file names to be UTF-8 encoded.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



===Re:===
Yes, but I am unsure whether it [UTF-8] can be set as a thread locale
for the sake 
of file APIs.
===end===

Why wouldn't it?  If the ANSI forms simply allocate buffers and call
WideCharToMultiByte and MultiByteToWideChar, it should work with
anything those functions handles.  My only concern would be with buffer
length when converting to MultiByte, if it assumes a limit based on 2
bytes max per character.  But, it works with GB18030, which can have
4-byte characters.

It's certainly easy enough to try.

===Re:===
Yeah, but unfortunately it [GB18030] is explicitly documented that it is
only 
supported in MultiByteToWideChar, WideCharToMultiByte and some text
painting 
APIs in Windows, i.e the stdio functions and others may break horribly.
===end===

Code that works with the other multi-byte "ANSI" character sets, and GBK
in particular, will handle GB18030 "reasonably well" with no changes.
For example, printf ("xxx%sxxx", name), where each 'x' may actually be
any character, will work without problems -- it won't mis-identify the %
in the middle of a 4-byte character.  But printf ("%5s",name) will count
some of the characters in 'name' as two, and print less than 5 of them;
or worse yet, break a character in half.

I can't think of anything that breaks horribly.  Only situations that
involve counting them will have issues.

As empirical evidence, lots of Windows software works fine in China.
You need full GB18030 support to read a newspaper on the web, because
the 4-byte characters are mostly obscure and regional words, but also
proper nouns including the names of some prominent people (Prime
Minister or something like that; I don't remember exactly).  But mostly
you don't encounter them and chug along with GBK and the occasional '?'
where some character did not work.

--John

TradeStation Group, Inc. is a publicly-traded holding company (NASDAQ GS: TRAD) of three operating subsidiaries, TradeStation Securities, Inc. (Member NYSE, FINRA, SIPC and NFA), TradeStation Technologies, Inc., a trading software and subscription company, and TradeStation Europe Limited, a United Kingdom, FSA-authorized introducing brokerage firm. None of these companies provides trading or investment advice, recommendations or endorsements of any kind. The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux