Re: How to build gcc to support wchar_t and wstring on Cygwin


 



Hi Paul and Qihong,

> 1. Is it possible to have Cygwin 1.5 or 1.7 to support wstring?

Yes.  It's possible.  But not easy.

To support wstring, your platform needs to support a whole bunch of wchar_t
functionality.  q.v. Standard C++ IOStreams and Locales by Langer and Kreft
<http://www.amazon.com/dp/0201183951>.

It's really not enough to say "Hey, we've got wchar_t, and wchar_t character
traits, so we can turn on std::wstring."  The platform also needs to provide
support for the C++ wide character I/O:  wistream, wostream, facets, codecvt,
locale.  And that's where things become problematic.

For Cygwin on Windows, the platform is Cygwin (and its amazingly cool POSIX
API layer) more so than the WinAPI.

ASIDE:  I'm not sure what the MinGW situation is, but since MinGW uses
WinAPI directly rather than having a more Unix-like POSIX API as the
approach, MinGW may not be in this Cygwin situation.

Windows does support that functionality, *but* it appears that no one has
gone through the considerable effort of plumbing up Cygwin to use those
facilities.  (Those are the FooW routines, rather than the FooA routines, in
the WinAPI.  And there is a lot of work to get the locale magic to work
correctly.)

Also, Windows uses a wchar_t of 2 bytes.  (Previously UCS-2, from the Unicode
1.0 era.  Now with Vista and Win7, it's UTF-16.)  It would be a bit retro to
have Cygwin use a 2-byte wchar_t, rather than a 4-byte wchar_t.  That just
makes everything a little more difficult.
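
To illustrate why a 2-byte unit is more awkward, here is a rough sketch of
encoding one code point as UTF-16.  Anything above U+FFFF needs two units (a
surrogate pair), so "one wchar_t" no longer means "one character".  The name
encode_utf16 is just something I made up for this example; it is not a
Cygwin or WinAPI routine.

```cpp
#include <cstddef>

// Hypothetical helper: encode one Unicode code point as UTF-16.
// Writes 1 or 2 units into out and returns how many were written.
// Returns 0 for values that are not valid scalar values.
std::size_t encode_utf16(unsigned long cp, unsigned short out[2])
{
    // Surrogate code points and anything past U+10FFFF cannot be encoded.
    if (cp > 0x10FFFFul || (cp >= 0xD800ul && cp <= 0xDFFFul))
        return 0;

    if (cp < 0x10000ul) {
        // Basic Multilingual Plane: a single 16-bit unit.
        out[0] = static_cast<unsigned short>(cp);
        return 1;
    }

    // Supplementary planes: split the 20-bit offset into a surrogate pair.
    cp -= 0x10000ul;
    out[0] = static_cast<unsigned short>(0xD800ul + (cp >> 10));    // high
    out[1] = static_cast<unsigned short>(0xDC00ul + (cp & 0x3FFul)); // low
    return 2;
}
```

With a 4-byte wchar_t the same code point would fit in a single unit and
none of this bookkeeping would be needed.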

> 2. If it's possible how to build entire gcc or just libstdc++ if that works.

That won't work, because it takes considerable effort to plumb the C++ I/O
streams up to the platform API's facilities for wide character support.  And
I do not believe that Cygwin's POSIX layer provides that wide character
support.  [I may be mistaken.]

> Seems like Mingw works. But I can't use that.

Ahh, cool.  That answers my above aside.

> My program have to run on both Cygwin and Linux.

Then for Cygwin, use char and std::string, and have your strings be UTF-8.
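
As a minimal sketch of the UTF-8 route, something like the following builds
a UTF-8 std::string from code points and counts the characters in it.  The
names append_utf8 and utf8_length are hypothetical helpers for illustration
only, and the sketch assumes the input string is already well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: append one Unicode code point to a UTF-8 string.
// Returns false (and appends nothing) for surrogates or values > U+10FFFF.
bool append_utf8(std::string& out, unsigned long cp)
{
    if (cp > 0x10FFFFul || (cp >= 0xD800ul && cp <= 0xDFFFul))
        return false;

    if (cp < 0x80ul) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800ul) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000ul) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                                 // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}

// Hypothetical helper: count code points by skipping continuation bytes
// (those of the form 10xxxxxx).  Assumes s is well-formed UTF-8.
std::size_t utf8_length(const std::string& s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80)
            ++n;
    return n;
}
```

Note that s.size() still reports bytes, not characters, so anything that
indexes or truncates by character has to go through helpers like these.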

Or see if you can roll your own wide character type, and wide character
strings...

#include <string>

namespace myown
{
  typedef unsigned int Utf32; // UTF-32 encoding unit
  typedef std::basic_string<Utf32> String32; // UTF-32
  // Probably need to provide your own character traits for Utf32.
}

...which if you do not use the I/O facilities, wouldn't need all the locale,
facet, cvt, and other mojo.

I'm not sure if the typedef unsigned int Utf32; will incur other problems.
If it does incur undue problems due to collisions because of the unsigned int
aliasing, you can use the struct wrapper trick (as per Stroustrup's C++PL
11.7.1):

struct Utf32
{
  unsigned int m;
public:
  explicit Utf32(unsigned int in) : m(in) { }
  operator unsigned int() const { return m; }
};

That will make Utf32 a distinct type.
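
For what it's worth, here is a rough sketch of what "provide your own
character traits" might look like for a distinct Utf32 type.  Two
assumptions to flag: Utf32Traits and its eof value are my own invention, and
this sketch uses a plain aggregate without a constructor, because as far as
I know std::basic_string expects a trivial character type and current
library implementations may reject a wrapper with a user-written
constructor.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>   // std::mbstate_t
#include <ios>      // std::streamoff, std::streampos
#include <string>

namespace myown
{
  // Distinct, trivially copyable character type (aggregate, no constructor).
  struct Utf32 { unsigned int value; };

  // Hypothetical traits class; passed explicitly to std::basic_string.
  struct Utf32Traits
  {
    typedef Utf32          char_type;
    typedef unsigned int   int_type;
    typedef std::streamoff off_type;
    typedef std::streampos pos_type;
    typedef std::mbstate_t state_type;

    static void assign(char_type& d, const char_type& s) { d = s; }
    static bool eq(const char_type& a, const char_type& b) { return a.value == b.value; }
    static bool lt(const char_type& a, const char_type& b) { return a.value < b.value; }

    static int compare(const char_type* a, const char_type* b, std::size_t n)
    {
      for (std::size_t i = 0; i < n; ++i) {
        if (lt(a[i], b[i])) return -1;
        if (lt(b[i], a[i])) return 1;
      }
      return 0;
    }
    static std::size_t length(const char_type* s)
    {
      std::size_t n = 0;
      while (s[n].value != 0) ++n;   // zero-terminated, like strlen
      return n;
    }
    static const char_type* find(const char_type* s, std::size_t n, const char_type& c)
    {
      for (std::size_t i = 0; i < n; ++i)
        if (eq(s[i], c)) return s + i;
      return 0;
    }
    static char_type* move(char_type* d, const char_type* s, std::size_t n)
    {
      if (n) std::memmove(d, s, n * sizeof(char_type));  // ranges may overlap
      return d;
    }
    static char_type* copy(char_type* d, const char_type* s, std::size_t n)
    {
      if (n) std::memcpy(d, s, n * sizeof(char_type));
      return d;
    }
    static char_type* assign(char_type* d, std::size_t n, char_type c)
    {
      for (std::size_t i = 0; i < n; ++i) d[i] = c;
      return d;
    }

    static int_type to_int_type(const char_type& c) { return c.value; }
    static char_type to_char_type(const int_type& i) { char_type c = { i }; return c; }
    static bool eq_int_type(const int_type& a, const int_type& b) { return a == b; }
    // Assumed sentinel: 0xFFFFFFFF lies outside the Unicode code point range.
    static int_type eof() { return 0xFFFFFFFFu; }
    static int_type not_eof(const int_type& i) { return i == eof() ? 0 : i; }
  };

  typedef std::basic_string<Utf32, Utf32Traits> String32; // UTF-32
}
```

Since the traits class is passed as the second template argument, nothing
in namespace std gets specialized, and no stream or locale machinery is
pulled in as long as you stay away from operator<< and friends.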

HTH,
--Eljay


