Re: [PATCH v2 0/7] Sparse Index: integrate with reset

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Tue, 05 Oct 2021 17:34:48 +0200

On Tue, Oct 05 2021, Victoria Dye via GitGitGadget wrote:

> The p2000 tests demonstrate an overall ~70% execution time reduction across
> all tested usages of git reset using a sparse index:

[...]

> Test                                               before   after       
> ------------------------------------------------------------------------
> 2000.22: git reset (full-v3)                       0.48     0.51 +6.3% 
> 2000.23: git reset (full-v4)                       0.47     0.50 +6.4% 
> 2000.24: git reset (sparse-v3)                     0.93     0.30 -67.7%
> 2000.25: git reset (sparse-v4)                     0.94     0.29 -69.1%
> 2000.26: git reset --hard (full-v3)                0.69     0.68 -1.4% 
> 2000.27: git reset --hard (full-v4)                0.75     0.68 -9.3% 
> 2000.28: git reset --hard (sparse-v3)              1.29     0.34 -73.6%
> 2000.29: git reset --hard (sparse-v4)              1.31     0.34 -74.0%
> 2000.30: git reset -- does-not-exist (full-v3)     0.54     0.51 -5.6% 
> 2000.31: git reset -- does-not-exist (full-v4)     0.54     0.52 -3.7% 
> 2000.32: git reset -- does-not-exist (sparse-v3)   1.02     0.31 -69.6%
> 2000.33: git reset -- does-not-exist (sparse-v4)   1.07     0.30 -72.0%

This series looks like it really improves some cases, but as the cost of
that ~70% improvement we've got a ~5% regression in 7/7 for the full-v3
"-- does-not-exist" case. As noted in your 7/7 (which improves all other
cases):

    (full-v3)     0.79(0.38+0.30)   0.91(0.43+0.34) +15.2%
    (full-v4)     0.80(0.38+0.29)   0.85(0.40+0.35) +6.2%

Which, b.t.w., I had to read a couple of times before realizing that its
quoted table:

    Test          before            after
    ------------------------------------------------------
    (full-v3)     0.79(0.38+0.30)   0.91(0.43+0.34) +15.2%
    (full-v4)     0.80(0.38+0.29)   0.85(0.40+0.35) +6.2%
    (sparse-v3)   0.76(0.43+0.69)   0.44(0.08+0.67) -42.1%
    (sparse-v4)   0.71(0.40+0.65)   0.41(0.09+0.65) -42.3%

is just the does-not-exist part of this bigger table. Are the other
cases all ~0% changed, or ...?

Anyway, until 7/7 the full-v3 case had been sped up, but a ~10% increase
landed us at ~+6%, and full-v4 had been ~0% but got ~6% worse?

Is there a way we can get those improvements in performance without
regressing on the full-* cases?

Also, these tests only check sparse performance, but isn't some of the
code being modified here general enough that it's also used outside of
the sparse mode, full checkout cone or not?

It looks fairly easy to extend p2000-sparse-operations.sh to run the
same tests but just pretend that it's running in a "full" mode without
actually setting up anything sparse-specific (the meat of those tests
just runs "git status" etc.). How does that look with this series?

Since only the CL and 7/7 quote numbers from p2000, and 7/7 is at least
a partial regression, it would be nice to have perf numbers on each
commit (if only as a one-off for ML consumption). Are there any more
improvements followed by regressions followed by improvements as we go
along? Would be useful to know...
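
For the per-commit numbers, something like the following might do as a
one-off. This is only a rough sketch: it assumes t/perf's "run" script
takes a list of revisions, then "--", then the test script, and the
perf_cmd helper is a made-up name that only prints the command rather
than running it:

```shell
# Hypothetical helper: print (do not run) a t/perf invocation that
# compares every revision given as an argument on p2000.
# Assumes "./run <rev>... -- <test-script>" is the accepted form.
perf_cmd() {
    printf './run'
    for rev in "$@"; do
        printf ' %s' "$rev"
    done
    printf ' -- p2000-sparse-operations.sh\n'
}
```

From t/perf, "perf_cmd BASE $(git rev-list --reverse BASE..TOPIC)" would
then print a command covering the base plus each commit in order, where
BASE and TOPIC are placeholders for wherever the series starts and ends.
Each revision gets built separately, so expect it to take a while.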