On Mon, Sep 03, 2018 at 06:48:54PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > And there are definitely a few nasty bits (like the way the progress is
> > ended). I'm not planning on taking this further for now, but maybe
> > you or somebody can find it interesting or useful.
>
> I think it would be really nice if this were taken further. Using my
> perf test in
> https://public-inbox.org/git/20180903144928.30691-7-avarab@xxxxxxxxx/T/#u
> I get these results:
>
>     $ GIT_PERF_LARGE_REPO=/home/aearnfjord/g/linux GIT_PERF_REPEAT_COUNT=5 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run HEAD~ HEAD p1450-fsck.sh
>     [...]
>     Test           HEAD~                   HEAD
>     ----------------------------------------------------------------
>     1450.1: fsck   384.18(381.63+2.53)     301.52(508.28+38.34) -21.5%
>
> I.e. this gives a 20% speedup, although of course some of that might be
> because some of this might be skipping too much work, but looks really
> promising.

I'm pretty sure it's doing the correct thing, in terms of doing all the
right checks. But look at your CPU time. You're getting a 20% wall-clock
speedup, but spending a lot more CPU. So the main difference is really
the multi-threading in index-pack.

It should be strictly worse in terms of total CPU on a single-processor
system because we're doing work in the sub-process (so we pay for the
process invocation, but also we probably are unable to share things like
in-memory commit structs, wasting a little extra time).

So I'm on the fence on whether it is worth it. I like getting rid of the
duplicated code. But on the other hand it is not all that complex, and
maybe when it comes to things like fsck it is good to have a different
implementation than the one that writes the .idx out in the first place.

-Peff
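
To put numbers on the wall-clock versus CPU point, here is a rough sketch that redoes the arithmetic on the two timing cells quoted above. It assumes each cell reads as elapsed(user+sys) in seconds, which is my reading of the perf output rather than something stated in this thread; the helper names are made up for illustration.

```python
import re

# Assumption: a timing cell is "elapsed(user+sys)", all values in seconds.
CELL = re.compile(r"^\s*([\d.]+)\(([\d.]+)\+([\d.]+)\)\s*$")


def parse_cell(cell):
    """Return (wall, cpu) for one timing cell, where cpu = user + sys."""
    m = CELL.match(cell)
    if m is None:
        raise ValueError(f"unrecognized timing cell: {cell!r}")
    wall, user, sys_time = (float(g) for g in m.groups())
    return wall, user + sys_time


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# The two cells from the quoted p1450-fsck.sh run (HEAD~ and HEAD).
old_wall, old_cpu = parse_cell("384.18(381.63+2.53)")
new_wall, new_cpu = parse_cell("301.52(508.28+38.34)")

wall_delta = pct_change(old_wall, new_wall)
cpu_delta = pct_change(old_cpu, new_cpu)

print(f"wall-clock: {wall_delta:+.1f}%")  # -21.5%
print(f"total CPU:  {cpu_delta:+.1f}%")   # +42.3%
```

Under that reading, the run finishes about 21.5% sooner but burns roughly 42% more CPU seconds in total (about 546.6s against 384.2s), which is the trade-off described above.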