On Tue, Jun 29, 2021 at 10:04:56PM -0400, Jeff King wrote:
> In the warm-cache case, the improvement seems to go away (or maybe I'm
> holding it wrong; even in the cold-cache case I don't get anywhere near
> as impressive a speedup as you showed above). Which isn't to say that
> cold-cache isn't sometimes important, and this may still be worth doing.
> But it really seems like the CPU involve in walking over the file isn't
> actually that much.

Hmm. I think that you might be holding it wrong, or at least I'm able to
reproduce some substantial improvements in the warm cache case with
limited traversals.

Here are a few runs of the same hyperfine invocation, just swapping the
`--prepare` which drops the disk cache for `--warmup 3` which populates
it.

    $ hyperfine \
      'GIT_READ_COMMIT_TABLE=0 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2' \
      'GIT_READ_COMMIT_TABLE=1 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2' \
      --warmup 3

    Benchmark #1: GIT_READ_COMMIT_TABLE=0 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
      Time (mean ± σ):      23.1 ms ±   6.4 ms    [User: 9.4 ms, System: 13.6 ms]
      Range (min … max):    13.8 ms …  35.8 ms    161 runs

    Benchmark #2: GIT_READ_COMMIT_TABLE=1 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
      Time (mean ± σ):      11.2 ms ±   1.8 ms    [User: 7.5 ms, System: 3.7 ms]
      Range (min … max):     4.7 ms …  12.6 ms    238 runs

Switching from loading just an individual commit to looking at all
branches, I get the following 2.01x improvement:

    Benchmark #1: GIT_READ_COMMIT_TABLE=0 git.compile rev-list --count --objects --use-bitmap-index --branches
      Time (mean ± σ):      22.5 ms ±   5.8 ms    [User: 8.5 ms, System: 14.0 ms]
      Range (min … max):    14.1 ms …  34.9 ms    157 runs

    Benchmark #2: GIT_READ_COMMIT_TABLE=1 git.compile rev-list --count --objects --use-bitmap-index --branches
      Time (mean ± σ):      11.2 ms ±   1.9 ms    [User: 7.1 ms, System: 4.1 ms]
      Range (min … max):     4.7 ms …  13.4 ms    239 runs

But there are some diminishing returns when I include --tags, too, since
I assume that there is some more traversal involved in older parts of
the kernel's history which aren't as well covered by bitmaps. But it's
still an improvement of 1.17x (give or take .31x, according to
hyperfine).

    Benchmark #1: GIT_READ_COMMIT_TABLE=0 git.compile rev-list --count --objects --use-bitmap-index --branches --tags
      Time (mean ± σ):      66.8 ms ±  12.4 ms    [User: 43.6 ms, System: 23.1 ms]
      Range (min … max):    54.4 ms …  92.3 ms    39 runs

    Benchmark #2: GIT_READ_COMMIT_TABLE=1 git.compile rev-list --count --objects --use-bitmap-index --branches --tags
      Time (mean ± σ):      57.3 ms ±  10.9 ms    [User: 37.5 ms, System: 19.8 ms]
      Range (min … max):    44.0 ms …  79.5 ms    45 runs

> I got an even more curious result when adding in "--not --all" (which
> the connectivity check under discussion would do). There the improvement
> from your patch should be even less, because we'd end up reading most of
> the bitmaps anyway. But I got:

Interesting. Discussion about that series aside, I go from 28.6ms
without reading the table to 35.1ms reading it, which is much better in
absolute timings, but much worse relatively speaking.

I can't quite seem to make sense of the perf report on that command.
Most of the time is spent faulting pages in, but most of the time
measured in the "git" object is in ubc_check. I don't really know how to
interpret that, but I'd be curious if you had any thoughts. I was just
looking at:

    $ GIT_READ_COMMIT_TABLE=1 perf record -F997 -g \
        git.compile rev-list --count --objects \
        --use-bitmap-index 2ab38c17aac --not --all
    $ perf report

Thanks,
Taylor
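
For anyone checking the ratios quoted above, here is a minimal sketch of
how a speedup and its "give or take" figure can be derived from two
mean ± σ pairs. It assumes first-order propagation of independent errors,
which reproduces the 1.17x ± .31x that hyperfine reported for the
--branches --tags runs; the function name `speedup` is made up for this
note and is not part of hyperfine or git.

```python
import math


def speedup(base_mean, base_sd, new_mean, new_sd):
    """Return (ratio, uncertainty) of base timing over new timing.

    Assumes the two measurements are independent, so the relative
    errors add in quadrature (first-order error propagation).
    """
    ratio = base_mean / new_mean
    rel_err = math.sqrt((base_sd / base_mean) ** 2 + (new_sd / new_mean) ** 2)
    return ratio, ratio * rel_err


if __name__ == "__main__":
    # (label, mean/sd without the table, mean/sd with the table), all in ms,
    # copied from the hyperfine output above.
    runs = [
        ("--branches", 22.5, 5.8, 11.2, 1.9),
        ("--branches --tags", 66.8, 12.4, 57.3, 10.9),
    ]
    for label, m0, s0, m1, s1 in runs:
        ratio, err = speedup(m0, s0, m1, s1)
        print(f"{label}: {ratio:.2f}x ± {err:.2f}x")
```

With σ at 20-30% of the mean in most of these runs, the error bars on
the ratios are wide, so the smaller speedups should be read with that in
mind.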