[Cc:ing Benjamin Kramer & René Scharfe because they both worked on the
REG_STARTEND code in grep.c that I replace in this iteration of the
patch series]

This patch series addresses a problem where `git diff` is called using
`-G` or `-S --pickaxe-regex` on new-born files that are configured
without user diff drivers, and that hence get mmap()ed into memory.

The problem with that: mmap()ed memory is *not* NUL-terminated, yet the
pickaxe code calls regexec() on it just the same.

This problem has been reported by my colleague Chris Sidi.

We solve this by introducing a helper, regexec_buf(), that takes a
pointer and a length instead of a NUL-terminated string.

As of this iteration, the helper relies on REG_STARTEND unconditionally.
Where the C library's regex support lacks REG_STARTEND, Git has to be
built with NO_REGEX so that compat/regex/ (which has it) is used
instead. Given the widespread support for REG_STARTEND (Linux has it,
MacOSX has it, Git for Windows has it because it uses compat/regex/
that has it), I think this is a fair trade-off.

Changes since v3:

- reworded the onelines as per Junio's suggestions.

- removed the fallback for when REG_STARTEND is not supported, in favor
  of requiring NO_REGEX.

- removed the regmatch() function from grep.c, in favor of using
  regexec_buf().
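
For readers not familiar with REG_STARTEND, here is a minimal
stand-alone sketch of the idea. It is not part of the series: the names
match_buf() and buf_contains() are made up for illustration, and it
assumes a regex library that provides REG_STARTEND (a non-POSIX
extension).

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

#ifndef REG_STARTEND
#error "this sketch needs a regex library that provides REG_STARTEND"
#endif

/*
 * Hypothetical stand-alone counterpart of regexec_buf(): match `preg`
 * against the `size` bytes starting at `buf`. The buffer does not need
 * to be NUL-terminated, because REG_STARTEND tells regexec() to look
 * only at the range given in pmatch[0].
 */
static int match_buf(const regex_t *preg, const char *buf, size_t size,
		     size_t nmatch, regmatch_t pmatch[], int eflags)
{
	assert(nmatch > 0 && pmatch);
	pmatch[0].rm_so = 0;
	pmatch[0].rm_eo = (regoff_t)size;
	return regexec(preg, buf, nmatch, pmatch, eflags | REG_STARTEND);
}

/*
 * Hypothetical helper for illustration: returns 1 if `pattern` matches
 * within the first `size` bytes of `buf`, 0 if it does not, and -1 if
 * the pattern fails to compile.
 */
static int buf_contains(const char *pattern, const char *buf, size_t size)
{
	regex_t re;
	regmatch_t m[1];
	int ret;

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	ret = match_buf(&re, buf, size, 1, m, 0);
	regfree(&re);
	return ret == 0 ? 1 : 0;
}
```

Calling buf_contains("baz", "foo bar baz", 7) looks only at the bytes
"foo bar" and therefore reports no match, even though "baz" follows
immediately in memory. That is the property the pickaxe code needs for
mmap()ed data.
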

Johannes Schindelin (3):
  regex: -G<pattern> feeds a non NUL-terminated string to regexec() and
    fails
  regex: add regexec_buf() that can work on a non NUL-terminated string
  regex: use regexec_buf()

 Makefile                |  3 ++-
 diff.c                  |  3 ++-
 diffcore-pickaxe.c      | 18 ++++++++----------
 git-compat-util.h       | 13 +++++++++++++
 grep.c                  | 14 ++------------
 t/t4061-diff-pickaxe.sh | 22 ++++++++++++++++++++++
 xdiff-interface.c       | 13 ++++---------
 7 files changed, 53 insertions(+), 33 deletions(-)
 create mode 100755 t/t4061-diff-pickaxe.sh

Published-As: https://github.com/dscho/git/releases/tag/mmap-regexec-v4
Fetch-It-Via: git fetch https://github.com/dscho/git mmap-regexec-v4

Interdiff vs v3:

 diff --git a/Makefile b/Makefile
 index df4f86b..c6f7f66 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -301,7 +301,8 @@ all::
  # crashes due to allocation and free working on different 'heaps'.
  # It's defined automatically if USE_NED_ALLOCATOR is set.
  #
 -# Define NO_REGEX if you have no or inferior regex support in your C library.
 +# Define NO_REGEX if your C library lacks regex support with REG_STARTEND
 +# feature.
  #
  # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
  # user.
 diff --git a/git-compat-util.h b/git-compat-util.h
 index 627ec5f..8aab0c3 100644
 --- a/git-compat-util.h
 +++ b/git-compat-util.h
 @@ -977,25 +977,17 @@ void git_qsort(void *base, size_t nmemb, size_t size,
  #define qsort git_qsort
  #endif
  
 +#ifndef REG_STARTEND
 +#error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd"
 +#endif
 +
  static inline int regexec_buf(const regex_t *preg, const char *buf, size_t size,
  			      size_t nmatch, regmatch_t pmatch[], int eflags)
  {
 -#ifdef REG_STARTEND
  	assert(nmatch > 0 && pmatch);
  	pmatch[0].rm_so = 0;
  	pmatch[0].rm_eo = size;
  	return regexec(preg, buf, nmatch, pmatch, eflags | REG_STARTEND);
 -#else
 -	char *buf2 = xmalloc(size + 1);
 -	int ret;
 -
 -	memcpy(buf2, buf, size);
 -	buf2[size] = '\0';
 -	ret = regexec(preg, buf2, nmatch, pmatch, eflags);
 -	free(buf2);
 -
 -	return ret;
 -#endif
  }
  
  #ifndef DIR_HAS_BSD_GROUP_SEMANTICS
 diff --git a/grep.c b/grep.c
 index d7d00b8..1194d35 100644
 --- a/grep.c
 +++ b/grep.c
 @@ -898,17 +898,6 @@ static int fixmatch(struct grep_pat *p, char *line, char *eol,
  	}
  }
  
 -static int regmatch(const regex_t *preg, char *line, char *eol,
 -		    regmatch_t *match, int eflags)
 -{
 -#ifdef REG_STARTEND
 -	match->rm_so = 0;
 -	match->rm_eo = eol - line;
 -	eflags |= REG_STARTEND;
 -#endif
 -	return regexec(preg, line, 1, match, eflags);
 -}
 -
  static int patmatch(struct grep_pat *p, char *line, char *eol,
  		    regmatch_t *match, int eflags)
  {
 @@ -919,7 +908,8 @@ static int patmatch(struct grep_pat *p, char *line, char *eol,
  	else if (p->pcre_regexp)
  		hit = !pcrematch(p, line, eol, match, eflags);
  	else
 -		hit = !regmatch(&p->regexp, line, eol, match, eflags);
 +		hit = !regexec_buf(&p->regexp, line, eol - line, 1, match,
 +				   eflags);
  
  	return hit;
  }

-- 
2.10.0.windows.1.10.g803177d

base-commit: f6727b0509ec3417a5183ba6e658143275a734f5