On Thu, Feb 16, 2023 at 12:56 AM Torsten Bögershausen <tboegi@xxxxxx> wrote: > On Wed, Feb 15, 2023 at 10:16:21AM -0500, Jeff King wrote: > > +test_expect_success 'symbolic-ref --short handles complex utf8 case' ' > > + name="测试-加-增加-加-增加" && > > + git symbolic-ref TEST_SYMREF "refs/heads/$name" && > > + # In the real world, we saw problems with this case only > > + # when the locale includes UTF-8. Set it here to try to make things as > > + # hard as possible for us to pass, but in practice we should do the > > + # right thing regardless (and of course some platforms may not even > > + # have this locale). > > + LC_ALL=en_US.UTF-8 git symbolic-ref --short TEST_SYMREF >actual && > > + echo "$name" >expect && > > + test_cmp expect actual > > +' > > The other thing seems to be that there is a bug even with > LANG=C, see the response from Eric here: > > $ git symbolic-ref --short HEAD | xxd > 00000000: e6b5 8be8 af95 2de5 8a0a ......-... > $ LANG=C git symbolic-ref --short HEAD | xxd > 00000000: e6b5 8be8 af95 2de5 8aa0 2de5 a29e e58a ......-...-..... > 00000010: a02d e58a a02d e5a2 9ee5 8aa0 0a .-...-....... I'm confused. To what bug do you refer? In my tests, LANG=C seemed to sidestep the problem. > Does it make sense to > a) Use the local locale, what ever that is > b) Re-run with LC_ALL=en_US.UTF-8 > c) Re-run with LANG=C (that is where I had suspected problems when using UTF-8) In my tests, LANG=C is the only case which seemed to work correctly when the implementation used fscanf(). > d) Mention MacOs here ? Certainly, a good idea. > Somewhat in that style: > > test_expect_success 'symbolic-ref --short handles complex utf8 case' ' > name="测试-加-增加-加-增加" && > git symbolic-ref TEST_SYMREF "refs/heads/$name" && > # In the real world, we saw problems with this case only under MacOs Ventura I'm on ancient High Sierra (10.13) using HFS+, so the problem is not Ventura-specific. The original bug report did mention Ventura (which presumably is using APFS). > # when the locale includes UTF-8. Try it here to try to make things as > # hard as possible for us to pass, but in practice we should do the > # right thing regardless (and of course some platforms may not even > # have this locale). > # Use try even the default and LANG=C > git symbolic-ref --short TEST_SYMREF >actual && > echo "$name" >expect && > test_cmp expect actual && > LC_ALL=en_US.UTF-8 git symbolic-ref --short TEST_SYMREF >actual && > echo "$name" >expect && > test_cmp expect actual && > LANG=C git symbolic-ref --short TEST_SYMREF >actual && > echo "$name" >expect && > test_cmp expect actual > '