Re: grep: fix multibyte regex handling under macOS (1819ad327b7a1f19540a819813b70a0e8a7f798f)

On Fri, Feb 3, 2023 at 11:55 AM Jeff King <peff@xxxxxxxx> wrote:
> Just a guess, but does calling:
>
>   setlocale(LC_CTYPE, "");
>
> at the start of the program change things (you'll probably need to also
> include locale.h)?

Indeed, the new output is

    illegal byte sequence

for the following program:

    #include <regex.h>
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <locale.h>

    int main(int argc, char **argv) {
        char *loc = setlocale(LC_CTYPE, "");
        assert (loc != NULL);
        regex_t re;
        int ret = regcomp(&re, "[\xc0-\xff][\x80-\xbf]+",
                          REG_EXTENDED | REG_NEWLINE);
        /* assert(ret != 0); */
        size_t errbuf_size = regerror(ret, &re, NULL, 0);
        char errbuf[errbuf_size];
        regerror(ret, &re, errbuf, errbuf_size);
        printf("%s\n", errbuf);
    }

My own locale output, for completeness' sake:

    LANG="fr_FR.UTF-8"
    LC_COLLATE="fr_FR.UTF-8"
    LC_CTYPE="fr_FR.UTF-8"
    LC_MESSAGES="fr_FR.UTF-8"
    LC_MONETARY="fr_FR.UTF-8"
    LC_NUMERIC="fr_FR.UTF-8"
    LC_TIME="fr_FR.UTF-8"
    LC_ALL="fr_FR.UTF-8"


-- 
D. Ben Knoble


