On Wed, 1 Feb 2023 at 16:25, D. Ben Knoble <ben.knoble@xxxxxxxxx> wrote:
>
> I recently updated to git 2.39.1 and noticed today that `git diff
> --word-diff` fails for files with `diff=scheme`. I was able to narrow
> the failure down to the inclusion of control characters \xc0, \xff,
> \x80, \xbf by https://github.com/git/git/blob/2fc9e9ca3c7505bc60069f11e7ef09b1aeeee473/userdiff.c#L17
> in the definition of the scheme diff pattern (really, all patterns).
>
> I suspect the commit referenced in the subject, given that it messes
> with regex handling on macOS.
>
> Relevant environment that I can think of:
> ```
> # locale
> LANG="fr_FR.UTF-8"
> LC_COLLATE="fr_FR.UTF-8"
> LC_CTYPE="fr_FR.UTF-8"
> LC_MESSAGES="fr_FR.UTF-8"
> LC_MONETARY="fr_FR.UTF-8"
> LC_NUMERIC="fr_FR.UTF-8"
> LC_TIME="fr_FR.UTF-8"
> LC_ALL="fr_FR.UTF-8"
> ```
>
> I'm on macOS 11.7.
>
> Failure (using Zsh to produce the characters; I think there's a Bash
> equivalent):
> ```
> # git diff --word-diff --word-diff-regex=$'[\xc0-\xff][\x80-\xbf]+'
> fatal : invalid regular expression: [¿-ˇ][Ä-ø]+
> ```

FWIW that looks pretty weird to me, like the escapes in the charclass
were interpolated before being fed to the regex engine. Are you sure
you tested the right thing?

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"