Re: grep: fix multibyte regex handling under macOS (1819ad327b7a1f19540a819813b70a0e8a7f798f)

On Wed, Feb 1, 2023 at 6:03 PM Jeff King <peff@xxxxxxxx> wrote:
> So the regex engine is complaining that it is getting bytes with high
> bits set, but that are not part of a multi-byte character. I.e., it is
> not happy to do bytewise matching, but really wants valid UTF8 in the
> expression.

I did manage to find that the call to regcomp in diff.c's
init_diff_words_data (line 2212 in v2.39.1) is what crashes; I could
not step into it with gdb, however.

Further, the following C program compiles without warnings (except for
the unused main parameters):
```
#include <regex.h>
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

int main(int argc, char **argv) {
    regex_t re;
    int ret = regcomp(&re, "[\xc0-\xff][\x80-\xbf]+",
                      REG_EXTENDED | REG_NEWLINE);
    /* assert(ret != 0); */
    size_t errbuf_size = regerror(ret, &re, NULL, 0);
    char errbuf[errbuf_size];
    regerror(ret, &re, errbuf, errbuf_size);
    printf("%s\n", errbuf);
}
```

```
# CFLAGS='-Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition -Wshadow -Wpointer-arith -Wcast-qual -pedantic -std=c11'
# cc $CFLAGS regtest.c -o regtest && ./regtest
*** unknown regexp error code ***
```
(With the assertion uncommented, it fails because regcomp succeeds, i.e.
returns 0!)
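
One difference that may be worth ruling out, offered as an assumption rather
than a diagnosis: the test program above never calls setlocale, so it runs in
the "C" locale, whereas a process that has switched LC_CTYPE to a UTF-8 locale
might have the same pattern treated differently. The sketch below is a
hypothetical helper (try_regcomp is an illustrative name, not anything from
git) that compiles a pattern under a given LC_CTYPE locale and reports the
result, calling regerror only when regcomp actually fails:

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of git): switch LC_CTYPE to `locale`,
 * compile `pattern` with the same flags as the test program, and
 * describe the outcome in `msg`.
 * Returns regcomp's result, or -1 if the locale could not be set.
 */
int try_regcomp(const char *locale, const char *pattern,
                char *msg, size_t msglen)
{
    regex_t re;
    int ret;

    assert(msg != NULL && msglen > 0);

    if (setlocale(LC_CTYPE, locale) == NULL) {
        snprintf(msg, msglen, "locale unavailable");
        return -1;
    }

    ret = regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE);
    if (ret == 0) {
        /* Success: there is no error to look up, so skip regerror. */
        snprintf(msg, msglen, "ok");
        regfree(&re);
    } else {
        /* Failure: let the regex library describe the error code. */
        regerror(ret, &re, msg, msglen);
    }
    return ret;
}
```

Calling it once with "C" and once with a UTF-8 locale name (for example
"en_US.UTF-8", assuming that locale is installed) on the pattern
"[\xc0-\xff][\x80-\xbf]+" would show whether the locale changes the outcome.
Whether it does on macOS is not confirmed here.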

So I can neither find out what's to blame nor what to fix. Here are
the linked libraries on macOS (IIUC):
```
# otool -L regtest
regtest:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
# otool -L ./git-diff # from v2.39.1 source build today
./git-diff:
/System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices (compatibility version 1.0.0, current version 1141.1.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
/usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0)
/usr/local/opt/gettext/lib/libintl.8.dylib (compatibility version 12.0.0, current version 12.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1856.105.0)
```

-- 
D. Ben Knoble


