Re: [PATCH] ref-filter: disable save_commit_buffer while traversing

Jeff King <peff@xxxxxxxx> · Mon, 11 Jul 2022 13:47:55 -0400

On Mon, Jul 11, 2022 at 05:12:37PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Hrm, so for doing the format we're leaving some performance on the table
> as we're currently not making use of this cache, so this makes nothing
> worse on that front.
> 
> But doesn't this approach then also close the door on using the same
> cache for performance improvements in that area? I.e. spotting that
> we've already parsed that commit, so we can get it from the cache?

Yes, it does close that door, or at least make it more challenging. But
I suspect it's not a very fruitful door in the first place:

  - it only helps at all if your sort field or format dereferences the
    tag to get to the commit

  - it only helps if you actually use the traversal options

  - done naively, it's a tradeoff. You might traverse a million commits,
    but display only a handful of them. Is using all of that memory
    worth avoiding re-inflating a few commits? Certainly you can come up
    with a pathological case, but I doubt that it helps in practice.

  - you _could_ do it less naively by caching only the commits directly
    pointed to by refs. But the depths of the traversals don't know what
    those are. So you'd probably do it by pre-loading those commits
    before any traversal, which would work just fine before or after my
    patch. But I probably wouldn't because of the extra code complexity,
    and because...

  - if the commit is in a commit-graph file, then pre-loading it is
    actively harmful (because you might not need it anyway!). So really,
    the best performance will come from just having a commit graph. I
    don't see the point in writing more code to micro-optimize the
    fallback case. For this patch, it's just that the existing code is
    being actively stupid.

> B.t.w. did you try to benchmark this with --no-contains too, I tried e.g.:
> 
>     ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"

I didn't, but doing it now showed the same 5% speedup.

> Which gives me:
> 
> 	$ git hyperfine -L rev HEAD~1,HEAD -s 'make CFLAGS=-O3' './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' -w 1 
> 	Benchmark 1: ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD~1
> 	  Time (mean ± σ):      1.437 s ±  0.107 s    [User: 1.252 s, System: 0.082 s]
> 	  Range (min … max):    1.306 s …  1.653 s    10 runs
> 	 
> 	Benchmark 2: ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD
> 	  Time (mean ± σ):      1.335 s ±  0.044 s    [User: 1.230 s, System: 0.050 s]
> 	  Range (min … max):    1.260 s …  1.417 s    10 runs
> 	 
> 	Summary
> 	  './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD' ran
> 	    1.08 ± 0.09 times faster than './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD~1'

Isn't that showing that it's slightly faster with my patch? It's not as
fast as your "just --contains" example, but the exact improvement will
depend on how many commits your particular query has to traverse (and
from the improved overall time, it seems that "just --contains" is
traversing less). And I'd expect a fair bit of noise here. This really
isn't making anything faster; it's just using less memory, so any
speedup is coming from things like less-full memory caches.

BTW, I picked "--contains 1da177e4c3" in linux.git because I thought it
would have to traverse a lot (because that's the oldest commit), but it
actually isn't the worst case. Just asking about "--contains HEAD"
requires more traversal, but still shows a speed improvement (7.45s
before my patch, 7.26s after).

But I was curious about the heap savings there, since the numbers should
be larger. And indeed they are. Here's peak heap for two runs:

  1093721590 ./git.old --no-pager tag --contains HEAD
   218077941 ./git.new --no-pager tag --contains HEAD

Whoops, that is indeed 1GB of heap, which is what Olaf was seeing
(though I don't know what repo he's using, it's roughly proportional to
the number of commits). And 200MB afterwards. So indeed, I'd expect this
patch (or the hackier version I showed earlier) to make a significant
dent in his workload.

Of course the even better solution is to use commit-graphs. That drops
the memory usage to a few megabytes, and the time to only a few
milliseconds (because generation numbers let us avoid most of the
traversal).

-Peff