On Fri, Oct 27 2017, Joe Perches jotted:

> On Thu, 2017-10-26 at 10:45 -0700, Stefan Beller wrote:
>> On Thu, Oct 26, 2017 at 10:41 AM, Joe Perches <joe@xxxxxxxxxxx> wrote:
>> > On Thu, 2017-10-26 at 09:58 -0700, Stefan Beller wrote:
>> > > + Avar who knows a thing about pcre (I assume the regex compilation
>> > > has impact on grep speed)
>> > >
>> > > On Thu, Oct 26, 2017 at 8:02 AM, Joe Perches <joe@xxxxxxxxxxx> wrote:
>> > > > Comparing a cache warm git grep vs command line grep
>> > > > shows significant differences in cpu & wall clock.
>> > > >
>> > > > Any ideas how to improve this?
>> > > >
>> > > > $ time git grep "\bseq_.*%p\W" | wc -l
>> > > > 112
>> > > >
>> > > > real 0m4.271s
>> > > > user 0m15.520s
>> > > > sys 0m0.395s
>> > > >
>> > > > $ time grep -r --include=*.[ch] "\bseq_.*%p\W" * | wc -l
>> > > > 112
>> > > >
>> > > > real 0m1.164s
>> > > > user 0m0.847s
>> > > > sys 0m0.314s
>> > > >
>> > >
>> > > I wonder how much is algorithmic advantage vs coding/micro
>> > > optimization that we can do.
>> >
>> > As do I. I presume this is libpcre related.
>> >
>> > For instance, git grep performance is better than grep for:
>> >
>> > $ time git grep -w "seq_printf" -- "*.[ch]" | wc -l
>> > 8609
>> >
>> > real 0m0.301s
>> > user 0m0.548s
>> > sys 0m0.372s
>> >
>> > $ time grep -w -r --include=*.[ch] "seq_printf" * | wc -l
>> > 8609
>> >
>> > real 0m0.706s
>> > user 0m0.396s
>> > sys 0m0.309s
>> >
>>
>> One important piece of information is what version of Git you are running,
>>
>>
>> $ git tag --contains origin/ab/pcre-v2
>> v2.14.0
>
> v2.10
>
>> ...
>>
>> (and the version of pcre, see the numbers)
>> https://git.kernel.org/pub/scm/git/git.git/commit/?id=94da9193a6eb8f1085d611c04ff8bbb4f5ae1e0a
>
> I definitely didn't have that one.
>
> I recompiled git latest (with USE_LIBPCRE2) and reran.
>
> Here are the results
>
> $ git --version
> git version 2.15.0.rc2.48.g4e40fb3
>
> $ time git grep -P "\bseq_.*%p\W" -- "*.[ch]" | wc -l
> 112
>
> real 0m0.437s
> user 0m1.008s
> sys 0m0.381s
>
> So, git grep performance has already been
> quite successfully improved.

...and I have WIP patches to use the PCRE engine for patterns without -P
which I intend to start sending soon after the next release.
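
For readers unfamiliar with the pattern timed throughout this thread, here is a minimal sketch of what "\bseq_.*%p\W" selects. The sample lines are invented for illustration and are not taken from the kernel tree, and GNU grep is assumed (for its \b and \W extensions). It says nothing about the timings quoted above; it only shows which lines the regex is meant to catch.

```shell
#!/bin/sh
# Illustrative only: these sample lines are made up for the example.
sample=$(mktemp)
cat > "$sample" <<'EOF'
seq_printf(m, "addr=%p\n", ptr);
seq_printf(m, "addr=%pK\n", ptr);
seq_puts(m, "no pointer here\n");
printk("addr=%p\n", ptr);
seq_printf(m, "%p %d\n", ptr, n);
EOF

# \bseq_ requires "seq_" to start at a word boundary, .* spans the rest
# of the call, and \W requires a non-word character right after %p, so
# extended specifiers such as %pK are skipped.
matches=$(grep -c '\bseq_.*%p\W' "$sample")
echo "$matches"
rm -f "$sample"
```

The same pattern can be passed unchanged to git grep, with or without -P, as in the commands quoted above.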