Re: UTF-8, UTF-16 and UTF-32

There is a solution that will please everyone. Your argument against it is that it breaks the ABI, but haven't we learnt anything from the 2/4-byte int debacle of several decades ago? Why would you want to go through that all over again?



You ask why only GCC should change. MSVC++ already uses a 2-byte wchar_t; Borland C++ Builder has a policy of conforming to MSVC++ and most likely already uses a 2-byte wchar_t too; Sun Studio will most likely bend to the market reality. That will leave only GCC.



My preferred Solution: -



Standardise: - sizeof(char) = 1; sizeof(wchar_t) = 2; and sizeof(long wchar_t) = 4.

Implement all the string functions: - strcmp(); mbscmp(); wcscmp(); and lcscmp().
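As a rough sketch of what the fixed sizes and the parallel comparison functions could look like in practice: `long wchar_t` and `lcscmp()` are names proposed in this post and are not part of standard C or C++. The sketch below assumes the C++11 types `char16_t` and `char32_t` as stand-ins for a 2-byte `wchar_t` and the proposed 4-byte `long wchar_t`, and `compare_units` is a hypothetical helper name.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// Assumption: char16_t stands in for a 2-byte wchar_t, and char32_t for the
// proposed `long wchar_t`. Neither `long wchar_t` nor lcscmp() is standard.
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t holds a UTF-16 unit");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t holds a UTF-32 unit");

// One strcmp-style comparison for every code-unit width.
// Units are compared as unsigned values, as strcmp does for char.
template <typename CharT>
int compare_units(const CharT* a, const CharT* b) {
    using U = typename std::make_unsigned<CharT>::type;
    while (*a != CharT() && *a == *b) {
        ++a;
        ++b;
    }
    const U ua = static_cast<U>(*a);
    const U ub = static_cast<U>(*b);
    return (ua > ub) - (ua < ub);  // -1, 0 or 1
}

// Hypothetical lcscmp(), the 4-byte counterpart of strcmp()/wcscmp().
inline int lcscmp(const char32_t* a, const char32_t* b) {
    return compare_units(a, b);
}

// Small self-check showing the same call shape at each width.
inline void check_compare_units() {
    assert(compare_units("answer", "answer") == 0);
    assert(compare_units(u"abc", u"abd") < 0);
    assert(lcscmp(U"b", U"a") > 0);
}
```

The template is only one way to avoid writing the loop three times; separate hand-written functions per width would behave the same.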



In ASCII C++ source files: -

"String" returns type char

L"String" returns type wchar_t

LL"String" returns type long wchar_t



In UTF-8 C++ source files: -

"String" returns type unsigned char

L"String" returns type wchar_t

LL"String" returns type long wchar_t



In UTF-16 C++ source files: -

A"String" returns type unsigned char

"String" returns type wchar_t

LL"String" returns type long wchar_t



In UTF-32 C++ source files: -

A"String" returns type unsigned char

L"String" returns type wchar_t

"String" returns type long wchar_t
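The `A"..."` and `LL"..."` prefixes in the lists above are proposals and are not standard C++. For comparison, here is a minimal sketch of how a literal prefix selects the element type, assuming the C++11 prefixes `u"..."` and `U"..."` as the nearest existing analogue; `unit_size` and `unit_count` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Size in bytes of one code unit of a string literal.
template <typename T, std::size_t N>
constexpr std::size_t unit_size(const T (&)[N]) {
    return sizeof(T);
}

// Number of code units in a string literal, excluding the terminator.
template <typename T, std::size_t N>
constexpr std::size_t unit_count(const T (&)[N]) {
    return N - 1;
}

// The prefix alone decides the element type of the literal.
static_assert(unit_size("String") == 1, "narrow literal: char");
static_assert(unit_size(L"String") == sizeof(wchar_t), "L prefix: wchar_t");
static_assert(unit_size(u"String") == sizeof(char16_t), "u prefix: char16_t");
static_assert(unit_size(U"String") == sizeof(char32_t), "U prefix: char32_t");
static_assert(unit_count(U"String") == 6, "six code units plus terminator");

// Run-time version of the same checks.
inline void check_literal_prefixes() {
    assert(unit_size("String") == sizeof(char));
    assert(unit_count(u"String") == 6);
}
```

Note that in standard C++ the meaning of a prefix does not depend on the encoding of the source file, which is where the proposal above differs.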



In this solution there is something for everyone. The Chinese can write their source code in visible Mandarin in UTF-16 or UTF-32, rather than in hexadecimal ASCII escapes. The Europeans can save a few bytes by writing in UTF-8. We can all process files in any of the Unicode text formats from any OS. No one needs to implement dodgy string conversion routines that allocate memory and never release it. We can use constant strings as function parameters, such as strcmp(string,"answer"), rather than allocating and initialising vectors every time.
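To make the conversion concern concrete, here is a minimal sketch of a UTF-8 to UTF-32 conversion, assuming C++11 and a hypothetical function name `utf8_to_utf32`. It returns a `std::u32string`, which owns its buffer, so nothing is left for the caller to release. It rejects bad lead bytes, bad continuation bytes and truncated input, but it does not check for overlong encodings or surrogate values.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 bytes into UTF-32 code points (sketch, not a full validator).
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F;
            extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F;
            extra = 2;
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07;
            extra = 3;
        } else {
            throw std::invalid_argument("invalid UTF-8 lead byte");
        }
        if (i + extra >= in.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char byte = static_cast<unsigned char>(in[i + k]);
            if ((byte & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (byte & 0x3F);  // append six payload bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Small self-check: ASCII passes through, multi-byte sequences are merged.
inline void check_utf8_to_utf32() {
    assert(utf8_to_utf32("answer") == U"answer");
    assert(utf8_to_utf32("\xC3\xA9") == std::u32string(1, char32_t(0xE9)));
}
```

With native support for all three formats, as proposed here, a string constant would not need such a step at all; the sketch only covers text that arrives in a different format at run time.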



Why not support all three Unicode formats? If it breaks the ABI, then the ABI needs to be broken. We are all responsible for our own actions; if we let someone else make bad decisions for us, we are just as liable as if we had made the decisions ourselves.



Dallas.

http://www.ekkySoftware.com/
