Re: [PATCH v2 3/3] shorten_unambiguous_ref(): avoid sscanf()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 16, 2023 at 12:56 AM Torsten Bögershausen <tboegi@xxxxxx> wrote:
> On Wed, Feb 15, 2023 at 10:16:21AM -0500, Jeff King wrote:
> > +test_expect_success 'symbolic-ref --short handles complex utf8 case' '
> > +     name="测试-加-增加-加-增加" &&
> > +     git symbolic-ref TEST_SYMREF "refs/heads/$name" &&
> > +     # In the real world, we saw problems with this case only
> > +     # when the locale includes UTF-8. Set it here to try to make things as
> > +     # hard as possible for us to pass, but in practice we should do the
> > +     # right thing regardless (and of course some platforms may not even
> > +     # have this locale).
> > +     LC_ALL=en_US.UTF-8 git symbolic-ref --short TEST_SYMREF >actual &&
> > +     echo "$name" >expect &&
> > +     test_cmp expect actual
> > +'
>
> The other thing seems to be that there is a bug even with
> LANG=C, see the response from Eric here:
>
> $ git symbolic-ref --short HEAD | xxd
> 00000000: e6b5 8be8 af95 2de5 8a0a                 ......-...
> $ LANG=C git symbolic-ref --short HEAD | xxd
> 00000000: e6b5 8be8 af95 2de5 8aa0 2de5 a29e e58a  ......-...-.....
> 00000010: a02d e58a a02d e5a2 9ee5 8aa0 0a         .-...-.......

I'm confused. To what bug do you refer? In my tests, LANG=C seemed to
sidestep the problem.

> Does it make sense to
> a) Use the local locale, what ever that is
> b) Re-run with LC_ALL=en_US.UTF-8
> c) Re-run with LANG=C (that is where I had suspected problems when using UTF-8)

In my tests, LANG=C is the only case which seemed to work correctly
when the implementation used fscanf().

> d) Mention MacOs here ?

Certainly, a good idea.

> Somewhat in that style:
>
> test_expect_success 'symbolic-ref --short handles complex utf8 case' '
>         name="测试-加-增加-加-增加" &&
>         git symbolic-ref TEST_SYMREF "refs/heads/$name" &&
>         # In the real world, we saw problems with this case only under MacOs Ventura

I'm on ancient High Sierra (10.13) using HFS+, so the problem is not
Ventura-specific. The original bug report did mention Ventura (which
presumably is using APFS).

>         # when the locale includes UTF-8. Try it here to try to make things as
>         # hard as possible for us to pass, but in practice we should do the
>         # right thing regardless (and of course some platforms may not even
>         # have this locale).
>         # Use try even the default and LANG=C
>         git symbolic-ref --short TEST_SYMREF >actual &&
>         echo "$name" >expect &&
>         test_cmp expect actual &&
>         LC_ALL=en_US.UTF-8 git symbolic-ref --short TEST_SYMREF >actual &&
>         echo "$name" >expect &&
>         test_cmp expect actual &&
>         LANG=C git symbolic-ref --short TEST_SYMREF >actual &&
>         echo "$name" >expect &&
>         test_cmp expect actual
> '



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux