Re: [PATCH 2/2] for-each-ref: add --count-matches option

Phillip Wood <phillip.wood123@xxxxxxxxx> · Tue, 27 Jun 2023 11:05:29 +0100

On 27/06/2023 08:30, Jeff King wrote:
On Mon, Jun 26, 2023 at 03:09:57PM +0000, Derrick Stolee via GitGitGadget wrote:

+for pattern in "refs/heads/" "refs/tags/" "refs/remotes"
+do
+	test_perf "count $pattern: git for-each-ref | wc -l" "
+		git for-each-ref $pattern | wc -l
+	"
+
+	test_perf "count $pattern: git for-each-ref --count-match" "
+		git for-each-ref --count-matches $pattern
+	"
+done

I don't think this is a very realistic perf test, because for-each-ref
is doing a bunch of work to generate its default format, only to have
"wc" throw most of it away. Doing:

   git for-each-ref --format='%(refname)' | wc -l

That's a good point. I wondered whether a short fixed format string
would be even faster, so I tried:

git init test
cd test
git commit --allow-empty -m initial
seq 0 100000 |
sed "s:\(.*\):create refs/heads/some-prefix/\1 $(git rev-parse HEAD):" |
git update-ref --stdin
git pack-refs --all
hyperfine -L fmt "","--format=%\(refname\)","--format=x" \
	'git for-each-ref {fmt} refs/heads/ | wc -l'

Which gives

Benchmark 1: git for-each-ref  refs/heads/ | wc -l
  Time (mean ± σ):      1.150 s ±  0.010 s    [User: 0.494 s, System: 0.637 s]
  Range (min … max):    1.140 s …  1.170 s    10 runs

Benchmark 2: git for-each-ref --format=%\(refname\) refs/heads/ | wc -l
  Time (mean ± σ):      66.0 ms ±   0.3 ms    [User: 58.9 ms, System: 9.5 ms]
  Range (min … max):    65.2 ms …  67.1 ms    43 runs

Benchmark 3: git for-each-ref --format=x refs/heads/ | wc -l
  Time (mean ± σ):      63.0 ms ±   0.5 ms    [User: 54.3 ms, System: 9.6 ms]
  Range (min … max):    62.3 ms …  65.4 ms    44 runs

Summary
  git for-each-ref --format=x refs/heads/ | wc -l ran
    1.05 ± 0.01 times faster than git for-each-ref --format=%\(refname\) refs/heads/ | wc -l
   18.25 ± 0.20 times faster than git for-each-ref  refs/heads/ | wc -l

So on my somewhat slower machine the default format is over an order of
magnitude slower than using either --format=%(refname) or --format=x,
and the short fixed format is marginally faster than %(refname). I
haven't applied Stolee's patch, but the 3 or 4 times improvement
mentioned in the commit message seems likely to come from not processing
the default format. One thing to note is that we're not comparing like
with like when more than one pattern is given, as --count-matches gives
a separate count for each pattern.
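
To make the multi-pattern comparison closer to like with like, the piped
version would need to produce one count per pattern too. A minimal
sketch of that, using only the existing --format option; the helper name
count_refs_per_pattern and its "pattern count" output layout are made up
for illustration and are not meant to match whatever --count-matches
prints:

```shell
# Hypothetical helper: print "<pattern> <count>" for each pattern given.
# Runs git for-each-ref once per pattern, so the process start-up cost
# is paid once per pattern rather than once overall.
count_refs_per_pattern () {
	for pattern in "$@"
	do
		# --format=x emits one short fixed line per matching ref
		n=$(git for-each-ref --format=x "$pattern" | wc -l)
		# $((n)) strips the leading blanks some wc implementations emit
		printf '%s %d\n' "$pattern" "$((n))"
	done
}

# Example: count_refs_per_pattern refs/heads/ refs/tags/ refs/remotes/
```

Because this spawns one git process per pattern, it would probably lose
more ground to a single in-process count as the number of patterns
grows, so it should be benchmarked rather than assumed equivalent.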

I'm a bit suspicious of the massive speed up I'm seeing by avoiding the 
default format but it appears to be repeatable.

Best Wishes

Phillip

is much better (obviously you have to remember to do that if you care
about optimizing your command, but that's true of --count-matches, too).

Running hyperfine with three variants shows that the command above is
competitive with --count-matches, though slightly slower (hyperfine
complains about short commands and outliers because these runtimes are
so tiny in the first place; I omitted those warnings from the output
below for readability):

   Benchmark 1: ./git-for-each-ref refs/remotes/ | wc -l
     Time (mean ± σ):       6.1 ms ±   0.2 ms    [User: 3.0 ms, System: 3.6 ms]
     Range (min … max):     5.6 ms …   7.1 ms    397 runs

   Benchmark 2: ./git-for-each-ref --format="%(refname)" refs/remotes/ | wc -l
     Time (mean ± σ):       3.3 ms ±   0.2 ms    [User: 2.2 ms, System: 1.5 ms]
     Range (min … max):     3.0 ms …   4.0 ms    774 runs

   Benchmark 3: ./git-for-each-ref --count-matches refs/remotes/
     Time (mean ± σ):       2.4 ms ±   0.1 ms    [User: 1.5 ms, System: 0.9 ms]
     Range (min … max):     2.2 ms …   3.4 ms    1018 runs

   Summary
     './git-for-each-ref --count-matches refs/remotes/' ran
       1.33 ± 0.10 times faster than './git-for-each-ref --format="%(refname)" refs/remotes/ | wc -l'
       2.48 ± 0.17 times faster than './git-for-each-ref refs/remotes/ | wc -l'

I will note this is an unloaded multi-core system, which gives the piped
version a slight edge. Total CPU is probably more interesting than
wall-clock time, but all of these are so short that I think the results
should be taken with a pretty big grain of salt (I had to switch from
the "powersave" to "performance" CPU governor just to get more
consistent results).
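
As a rough illustration of the total-CPU view, the User and System
figures reported above can simply be added together. This is only
arithmetic on the numbers already shown, assuming hyperfine's User and
System values cover everything the benchmarked command line ran
(including "wc"); the helper name total_cpu is made up for this sketch:

```shell
# Hypothetical helper: total_cpu USER_MS SYSTEM_MS
# Prints user + system CPU time in milliseconds, to one decimal place.
total_cpu () {
	LC_ALL=C awk -v u="$1" -v s="$2" 'BEGIN { printf "%.1f\n", u + s }'
}

total_cpu 3.0 3.6	# default format | wc -l         -> 6.6
total_cpu 2.2 1.5	# --format="%(refname)" | wc -l  -> 3.7
total_cpu 1.5 0.9	# --count-matches                -> 2.4
```

By that measure the piped --format variant uses roughly 1.5 times the
CPU of --count-matches (3.7 ms against 2.4 ms), a somewhat wider gap
than the 1.33 times seen in wall-clock time, though at these magnitudes
the same grain of salt applies.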

-Peff