"Diomidis Spinellis via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Diomidis Spinellis <dds@xxxxxxx> > > The 2013 commit 29de20504e9 fixed t0070-fundamental.sh under Darwin > (macOS) by adopting Git's regex library. However, this library is > compiled with NO_MBSUPPORT, which causes git-grep to work incorrectly > on multibyte (e.g. UTF-8) files. Current macOS versions pass > t0070-fundamental.sh with the native macOS regex library, which > also supports multibyte characters. Very nice to see the last sentence in that paragraph. > Signed-off-by: Diomidis Spinellis <dds@xxxxxxx> > --- This part, from here ... > grep: fix multibyte regex handling under macOS > > The 2013 commit 29de20504e9 fixed t0070-fundamental.sh under Darwin > (macOS) by adopting Git's regex library. However, this library is > compiled with NO_MBSUPPORT, which causes git-grep to work incorrectly on > multibyte (e.g. UTF-8) files. Current macOS versions pass > t0070-fundamental.sh with the native macOS regex library, which also > supports multibyte characters. > > Adjust the Makefile to use the native regex library, and call > setlocale(3) to set CTYPE according to the user's preference. The > setlocale(3) call is required on all platforms, but in platforms > supporting gettext(3), setlocale(3) is called as a side-effect of > initializing gettext(3). > > To avoid running the new tests on platforms still using the > compatibility library, which is compiled without multibyte support, > store the corresponding NO_REGEX setting in the GIT-BUILD-OPTIONS file. > This makes it available to the test scripts. In addition, adjust the > test-tool regex command to work with multibyte regexes to further test a > platform's support for them. > > Signed-off-by: Diomidis Spinellis dds@xxxxxxx ... up to here does not need a separate sign-off. It is primarily to describe what got changed between v1 and this version. The changes can be seen in range-diff, and in some cases what the range-diff shows is sufficient to understand the difference, but it is better to either (1) supply explanation in your own words, or (2) omit almost duplicate description. > diff --git a/Makefile b/Makefile > index 04d0fd1fe60..d1a98257150 100644 > --- a/Makefile > +++ b/Makefile > @@ -1427,7 +1427,6 @@ ifeq ($(uname_S),Darwin) > APPLE_COMMON_CRYPTO = YesPlease > COMPAT_CFLAGS += -DAPPLE_COMMON_CRYPTO > endif > - NO_REGEX = YesPlease > PTHREAD_LIBS = > endif I notice that this is the ONLY conditional section in our primary Makefile that switches upon the value of $(uname_<anything>). After the dust settles, we should move this whole part to config.mak.uname as an unrelated clean-up. Alternatively, you can choose to do that (without losing NO_REGEX) as a preliminary clean-up patch and then build this "recent Darwin has usable regexp library" change on top of it. I do not have a preference either way, other than that we do not want to see that done as part of this change "while we are at it". > diff --git a/grep.c b/grep.c > index 82eb7da1022..c31657c3da3 100644 > --- a/grep.c > +++ b/grep.c > @@ -10,6 +10,8 @@ > #include "quote.h" > #include "help.h" > > +#include <locale.h> Don't ever include system headers in generic *.c files. The system headers on different platforms have subtle ordering restrictions among include files and definitions of feature test macros, and all those subtleties are supposed to be handled in one central place, in the "git-compat-util.h" file. A source file specific only to a particular platform is a different matter; they can include system headers that exist or are needed only on those systems directly. So far, "grep.c" has been successfully used for everybody (except Darwin), and if it needs touching, that means something is still wrong with Darwin. Is is because you are not using gettext()? I thought our gettext.[ch] did consider some folks/platforms do not use system gettext() but perhaps its support for such build environment is slightly lacking and does not make necessary setlocale() calls. If that is why you are mucking with this file, the right place to add changes like this is not here specific to grep.c, but to the start-up sequence that happens before cmd_main() inside common-main.c or somewhere around there, I think. Let me summon the author of 5e9637c6 (i18n: add infrastructure for translating Git with gettext, 2011-11-18) by Cc'ing.