On Thu, Aug 8, 2024 at 6:10 AM Thomas Munro <thomas.munro@xxxxxxxxx> wrote:
Thanks. The log didn't offer any more clues, and my colleague David R
has Windows and knows how to work its debugger so we sat down together
and chased this down (thanks David!).
1. It is indeed calling abort(), but it's not a PANIC or Assert() in
PostgreSQL, it's an assertion inside Windows' own setlocale():
minkernel\crts\ucrt\src\appcrt\convert\mbstowcs.cpp(245) : Assertion
failed: (pwcs == nullptr && sizeInWords == 0) || (pwcs != nullptr &&
sizeInWords > 0)
2. It is indeed confused about the encoding of the string
"Turkish_Türkiye.1254" itself, and works fine if you use "tr-TR".
3. It doesn't happen on 15, because 16 added a key ingredient:
commit bf03cfd162176d543da79f9398131abc251ddbb9
Author: Peter Eisentraut <peter@xxxxxxxxxxxxxx>
Date: Tue Jan 3 14:21:40 2023 +0100
Windows support in pg_import_system_collations
That causes it to spin through a bunch of system locales and switch to
them, and the first one is "aa". After it calls:
setlocale(2, "aa");
... then the next call to restore the previous locale is something like:
setlocale(2, "Turkish_T\252rkiye.1254");
(That \252 == 0xfc probably depends on your system's default
encoding.) It doesn't like that name anymore, and aborts. A minimal
program with just those two lines shows that.
It appears that after switching to "aa", it interprets the string
passed to the next call to setlocale() as some other encoding
(probably UTF-8, I dunno). I don't know why it doesn't fail and
return NULL, but there is a more general point that it's a bit bonkers
to use non-ASCII byte sequences in the library calls that are used to
control how non-ASCII byte sequences are interpreted. Maybe it can be
done if you're careful, but in particular a naive save-and-restore
sequence just won't work.
I guess a save-and-restore done with wsetlocale() could fix that. But
I decline to work on that, we need less Windows kludgery in the tree,
not more. I think a better answer is "don't do that".
Really, we *have* to chase all these non-BCP-47 locales out of the
installer (I hope you can work on that?),
yeah, It seems getlocales.cpp needs to be changed to achieve it.
I'll look into it
I'll look into it
out of PostgreSQL (testers
wanted[1]), and out of the world's existing clusters (maybe with
Dave's pg_upgrade idea, someone would need to write a patch, or maybe
someone could write a stand-alone locale migration program that just
connects to a cluster and (using some authoritative source, that's the
key bit to research) and replaces bad old names with nice new ones).
[1] https://www.postgresql.org/message-id/flat/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw@xxxxxxxxxxxxxx
Sandeep Thakkar