On Thu, Feb 20, 2014 at 4:34 PM, Daniel Verite <daniel@xxxxxxxxxxxxxxxx> wrote:
Although windows-1252 is a single-byte encoding that shares most
of the LATIN1 codes and character set, that does not mean that
English_United States.1252 is limited to this character set.
You may use UTF-8 databases with that locale.
Consider the 2nd paragraph of "Character Set Support"
in the doc:
http://www.postgresql.org/docs/current/static/multibyte.html
"For C or POSIX locale, any character set is allowed, but for other
locales there is only one character set that will work
correctly. (On Windows, however, UTF-8 encoding can be used with
any locale.)"
This is a key difference from Unix when choosing a locale.
As for getting exactly the same sort order as on Linux, that's not possible, but
it's not a Windows-vs-Unix issue. If you used FreeBSD or MacOS X, some
en_US.UTF-8 collation rules would differ from Linux's libc too, resulting in
a different sort order for certain strings.
There is no issue with using windows-1252 with a UTF-8 database. The point of discussion here is the sort order, and whether Windows has a code page for UTF-8.
The link http://msdn.microsoft.com/en-us/library/dd317756%28VS.85%29.aspx that I provided earlier lists those code pages, but creating a database with them fails.
Overall, from the discussion so far and whatever searching I could do on the net and in the community, it looks like there is no code page on Windows corresponding to UTF-8 on Linux. If there is one, please let me know.
Conclusion: I have decided to use UTF8 as the database encoding on both Windows and Linux, and to set the collation to 'C'.
That way my customers on Linux and Windows at least see the same behavior when sorting. Any gotchas here?
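To illustrate what I expect from collation 'C': as far as I understand, it compares the raw bytes of the strings, which for UTF-8 data amounts to code point order and should therefore be the same on every platform. Below is a minimal sketch in Python that mimics this ordering. The function name and the sample strings are made up for illustration, and it only approximates what the server does.

```python
# Sketch: emulate the ordering expected from collation 'C' on a UTF8
# database, assuming it is a plain byte-wise comparison of the
# UTF-8 encoded strings.

def c_collation_sort(values):
    """Sort strings by their UTF-8 bytes (hypothetical helper)."""
    return sorted(values, key=lambda s: s.encode("utf-8"))

# Hypothetical sample data mixing case and an accented character.
names = ["apple", "Zebra", "éclair", "banana", "Apple"]

print(c_collation_sort(names))
# → ['Apple', 'Zebra', 'apple', 'banana', 'éclair']
```

If that assumption holds, the visible consequence is that all uppercase ASCII letters sort before all lowercase ones, and accented characters sort after every ASCII character, which is not what an en_US locale would give.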
Regards...