Re: [PATCH 2/2] for-each-ref: add --count-matches option

On 27/06/2023 11:05, Phillip Wood wrote:
On 27/06/2023 08:30, Jeff King wrote:
I don't think this is a very realistic perf test, because for-each-ref
is doing a bunch of work to generate its default format, only to have
"wc" throw most of it away. Doing:

   git for-each-ref --format='%(refname)' | wc -l

That's a good point. I wondered whether a short fixed format string would be even better, so I tried:

git init test
cd test
git commit --allow-empty -m initial
seq 0 100000 | sed "s:\(.*\):create refs/heads/some-prefix/\1 $(git rev-parse HEAD):" | git update-ref --stdin
git pack-refs --all
hyperfine -L fmt "","--format=%\(refname\)","--format=x" 'git for-each-ref {fmt} refs/heads/ | wc -l'

Which gives
[...] Summary
   git for-each-ref --format=x refs/heads/ | wc -l ran
    1.05 ± 0.01 times faster than git for-each-ref --format=%\(refname\) refs/heads/ | wc -l
    18.25 ± 0.20 times faster than git for-each-ref  refs/heads/ | wc -l
[...] I'm a bit suspicious of the massive speed up I'm seeing by avoiding the default format but it appears to be repeatable.
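
For readers following along, here is a minimal sketch of what the `seq | sed` step in the setup above feeds to `git update-ref --stdin`. The helper name `gen_create_lines` and the object ID are placeholders for illustration only; in the real run the object ID comes from `$(git rev-parse HEAD)`.

```shell
# gen_create_lines is a hypothetical helper (not part of git): print one
# "create <ref> <oid>" instruction per number from 0 to $1 inclusive.
gen_create_lines() {
    last=$1   # highest ref number to generate
    oid=$2    # object ID every new ref should point at
    seq 0 "$last" | sed "s:\(.*\):create refs/heads/some-prefix/\1 $oid:"
}

# Placeholder object ID, for illustration only.
gen_create_lines 2 0123456789abcdef0123456789abcdef01234567
```

Because `seq 0 100000` includes both endpoints, the setup above creates 100001 refs, so that is the count `wc -l` should report for `refs/heads/` in a repository with no other branches.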

Having seen Peff's mail [1] I realized that my test repo above is looking up the commit from a loose object. If I repack the repository then the default format is still slower than using "--format=%(refname)" but is much more competitive.

$ git repack -a
Enumerating objects: 2, done.
Counting objects: 100% (2/2), done.
Writing objects: 100% (2/2), done.
Total 2 (delta 0), reused 0 (delta 0), pack-reused 0

$ hyperfine -L fmt "","--format=%\(refname\)","--format=x" 'git for-each-ref {fmt} refs/heads/ | wc -l'
Benchmark 1: git for-each-ref  refs/heads/ | wc -l
  Time (mean ± σ):     111.4 ms ±   1.4 ms    [User: 96.9 ms, System: 19.6 ms]
  Range (min … max):   109.6 ms … 115.1 ms    25 runs

Benchmark 2: git for-each-ref --format=%\(refname\) refs/heads/ | wc -l
  Time (mean ± σ):      66.7 ms ±   0.7 ms    [User: 59.5 ms, System: 9.5 ms]
  Range (min … max):    65.6 ms …  68.2 ms    42 runs

Benchmark 3: git for-each-ref --format=x refs/heads/ | wc -l
  Time (mean ± σ):      63.4 ms ±   0.7 ms    [User: 56.3 ms, System: 8.0 ms]
  Range (min … max):    61.9 ms …  65.1 ms    44 runs

Summary
  git for-each-ref --format=x refs/heads/ | wc -l ran
    1.05 ± 0.02 times faster than git for-each-ref --format=%\(refname\) refs/heads/ | wc -l
    1.76 ± 0.03 times faster than git for-each-ref  refs/heads/ | wc -l

So it seems most of the slowdown I was seeing yesterday was due to it looking up a loose object. I'm surprised repacking makes such a difference in a repository that only contains two objects.
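
One way to confirm whether a test repository's objects are loose or packed before benchmarking is `git count-objects -v`, which reports loose objects in its `count` field and packed objects in `in-pack`. The following is only a sketch; `object_storage` is a hypothetical helper name, not a git command.

```shell
# object_storage is a hypothetical helper: summarise how many objects in the
# repository at path $1 are stored loose and how many are stored in packs.
object_storage() {
    git -C "$1" count-objects -v | awk -F': ' '
        $1 == "count"   { loose = $2 }    # number of loose objects
        $1 == "in-pack" { packed = $2 }   # number of objects in packfiles
        END { printf "loose=%s packed=%s\n", loose, packed }'
}

# Usage, from inside the test repository:
#   object_storage .
```

Note that `git repack -a` without `-d` leaves the loose copies in place, so the loose count may stay non-zero after repacking even though the objects are now also available from the pack.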

Best Wishes

Phillip

[1] https://lore.kernel.org/git/20230627195900.GC1280909@xxxxxxxxxxxxxxxxxxxxxxx


