Re: grep: fix multibyte regex handling under macOS (1819ad327b7a1f19540a819813b70a0e8a7f798f)

On Wed, 1 Feb 2023 at 17:22, D. Ben Knoble <ben.knoble@xxxxxxxxx> wrote:
>
> On Wed, Feb 1, 2023 at 11:09 AM demerphq <demerphq@xxxxxxxxx> wrote:
> > FWIW that looks pretty weird to me, like the escapes in the charclass
> > were interpolated before being fed to the regex engine. Are you sure
> > you tested the right thing?
>
> Quite sure. `git diff --word-diff` fails. This was just a smaller
> example based on the linked C code.
>
> Here's the output of `git diff --word-diff` (verbatim and dumped):
>
> ```
> fatal : invalid regular expression: \|([^\\]*)\||([^][)(}{[
> ])+|[^[:space:]]|[¿-ˇ][Ä-ø]+
> 00000000: 6661 7461 6cc2 a03a 2069 6e76 616c 6964  fatal..: invalid
> 00000010: 2072 6567 756c 6172 2065 7870 7265 7373   regular express
> 00000020: 696f 6e3a 205c 7c28 5b5e 5c5c 5d2a 295c  ion: \|([^\\]*)\
> 00000030: 7c7c 285b 5e5d 5b29 287d 7b5b 2009 5d29  ||([^][)(}{[ .])
> 00000040: 2b7c 5b5e 5b3a 7370 6163 653a 5d5d 7c5b  +|[^[:space:]]|[
> 00000050: c02d ff5d 5b80 2dbf 5d2b 0a              .-.][.-.]+.
> ```

Interesting. The regex engine seems to be interpolating the \xC0 in
such a way that you aren't seeing the real pattern. In the Perl regex
engine I'd call that a bug (it used to do the same thing before we
fixed it years ago[1]). FWIW, this is a valid regex in Perl, so I don't
think the pattern is at fault; it's something else. I saw some
discussion recently that the Mac regex engine doesn't play nicely in
certain ways, but I don't recollect the details.
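
One way to read the dump above, sketched in Python. This is an
illustration only: the variable and function names are hypothetical,
and the idea that the raw bytes are the problem for a multibyte-aware
engine is an assumption, not something established in this thread. The
bytes are copied from the last line of the hex dump. On their own they
are not valid UTF-8, and decoding them as Mac Roman appears to give the
odd-looking characters in the printed error message.

```python
# Tail of the pattern, copied from the hex dump: [ C0 - FF ] [ 80 - BF ] +
pattern_tail = bytes.fromhex("5bc02dff5d5b802dbf5d2b")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# 0xC0 and 0xFF never occur in well-formed UTF-8, and 0x80/0xBF are
# continuation bytes that cannot stand alone, so this check fails.
tail_is_utf8 = is_valid_utf8(pattern_tail)

# Rendering the same bytes with a single-byte legacy encoding shows
# what a terminal might display instead of the escapes.
tail_as_mac_roman = pattern_tail.decode("mac_roman")

print(tail_is_utf8)
print(tail_as_mac_roman)
```

If that reading is right, the pattern the engine received contains
literal high bytes rather than escapes, which would fit the behaviour
described above.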

Sorry I can't help more. Try searching for EXTENDED and regex and mac.
Maybe you can find the mail I mean.

cheers,
yves
[1] I am one of the maintainers of the Perl regex engine.
-- 
perl -Mre=debug -e "/just|another|perl|hacker/"



