On Thu, Jun 07, 2018 at 09:09:25PM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Thu, Jun 07 2018, Matthew Wilcox wrote:
> > If the first atom of a regex is a bracket expression with an inverted range,
> > git grep is very slow.
>
> I have some WIP patches to fix all of this, which I'll hopefully submit
> before 2.19 is out the door.
>
> What you've discovered here is how shitty your libc regex engine is,
> because unless you provide -P and compile with a reasonably up-to-date
> libpcre (preferably v2) with JIT that's what you'll get.

I'm using Debian's build, and it is linked against a recent libpcre2:

$ ldd /usr/lib/git-core/git
        libpcre2-8.so.0 => /usr/lib/x86_64-linux-gnu/libpcre2-8.so.0 (0x00007f59ad5f2000)
$ dpkg --status libpcre2-8-0
Version: 10.31-3

But I wasn't using -P.  If I do, then I see the performance numbers you do:

$ time git grep -P '[^t]truct_size' >/dev/null
real    0m0.354s
user    0m0.340s
sys     0m0.639s

$ time git grep -P 'struct_size' >/dev/null
real    0m0.336s
user    0m0.552s
sys     0m0.457s

$ time git grep 'struct_size' >/dev/null
real    0m0.335s
user    0m0.535s
sys     0m0.474s

> So you need to just use an up-to-date libpcre2 & -P and performance
> won't suck.

I don't tend to use terribly advanced regexps, so I'll just set
grep.patternType to 'perl' and then it'll automatically be fast for me
without your patches ;-)

> My WIP patches will make us use PCRE for all grep modes, using an API it
> has to convert basic & extended regexp syntax to its own syntax, so
> we'll be able to do that transparently.

That's clearly the right answer.  Thanks!