On Tue, Mar 29, 2022 at 5:04 AM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>
>
> On Tue, Mar 29 2022, Neeraj K. Singh via GitGitGadget wrote:
>
> > V4 changes:
> >
> > * Make ODB transactions nestable.
> > * Add an ODB transaction around writing out the cached tree.
> > * Change update-index to use a more straightforward way of managing ODB
> >   transactions.
> > * Fix missing 'local's in lib-unique-files
> > * Add a per-iteration setup mechanism to test_perf.
> > * Fix camelCasing in warning message.
>
> Despite my
> https://lore.kernel.org/git/220329.86czi52ekn.gmgdl@xxxxxxxxxxxxxxxxxxx/
> I eventually gave up on trying to extract meaningful numbers from
> t/perf, I can never quite find out if they're because of its
> shellscripts shenanigans or actual code.
>
> (And also; I realize I didn't follow-up on
> https://lore.kernel.org/git/CANQDOdcFN5GgOPZ3hqCsjHDTiRfRpqoAKxjF1n9D6S8oD9--_A@xxxxxxxxxxxxxx/,
> sorry):
>

Looks like we aren't actually hitting fsync in the numbers you reported there, if they're down in the 20ms range. Or we simply aren't adding enough files. Or, if that's against a ramdisk, the ramdisk doesn't cost enough to represent real disk hardware.

> But I came up with this (uses my thin
> https://gitlab.com/avar/git-hyperfine/ wrapper, and you should be able
> to apt get hyperfine):
>
>     #!/bin/sh
>     set -xe
>
>     if ! test -d /tmp/scalar.git
>     then
>         git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git
>         mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack
>     fi
>     git hyperfine \
>         --warmup 1 -r 3 \
>         -L rev neeraj-v4,avar-RFC \
>         -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ && git ls-files -- t >repo/.git/to-add.txt' \
>         -p 'rm -rf repo/.git/objects/* repo/.git/index' \
>         $@'./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt'
>
>     git hyperfine \
>         --warmup 1 -r 3 \
>         -L rev neeraj-v4,avar-RFC \
>         -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/' \
>         -p 'rm -rf repo/.git/objects/* repo/.git/index' \
>         $@'./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .'
>
>     git hyperfine \
>         --warmup 1 -r 3 \
>         -L rev neeraj-v4,avar-RFC \
>         -s 'make CFLAGS=-O3' \
>         -p 'git init --bare dest.git' \
>         -c 'rm -rf dest.git' \
>         $@'./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack'
>
> Those tags are your v4 here & the v2 of the RFC I sent at
> https://lore.kernel.org/git/RFC-cover-v2-0.7-00000000000-20220323T140753Z-avarab@xxxxxxxxx/
>
> Which shows my RFC v2 is ~20% faster with:
>
>     $ PFX='strace' ~/g/git.meta/benchmark.sh "strace "
>
>     1.22 ± 0.02 times faster than 'strace ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4'
>     1.22 ± 0.01 times faster than 'strace ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4'
>     1.00 ± 0.01 times faster than 'strace ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4'
>
> But only for add/update-index, is the unpack-objects not using the
> tmp-objdir? (presumably yes).
>
> As noted before I've found "strace" to be a handy way to "simulate"
> slower FS ops on a ramdisk (I get about the same numbers sometimes on
> the actual non-SSD disk, but due to load on the system (that I'm not in
> full control of[1]) I can't get hyperfine to be happy with the
> non-fuzzyness:
>

At least in this case, I don't think 'strace' is representative of how a real disk would behave, unless you can somehow make strace of sync_file_range cost less than strace of fsync.

>     1.06 ± 0.02 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4'
>     1.06 ± 0.03 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4'
>     1.01 ± 0.01 times faster than './git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4'
>
> FWIW these are my actual non-fuzzy-with-strace numbers on the
> not-ramdisk, as you can see the intervals overlap, but for the first two
> the "min" time is never close to the RFC v2:
>
>     $ XDG_RUNTIME_DIR=/tmp/ghf ~/g/git.meta/benchmark.sh
>     + test -d /tmp/scalar.git
>     + git hyperfine --warmup 1 -r 3 -L rev neeraj-v4,avar-RFC -s make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ && git ls-files -- t >repo/.git/to-add.txt -p rm -rf repo/.git/objects/* repo/.git/index ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt
>     Benchmark 1: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4
>       Time (mean ± σ): 1.043 s ± 0.143 s [User: 0.184 s, System: 0.193 s]
>       Range (min … max): 0.943 s … 1.207 s 3 runs
>
>     Benchmark 2: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'avar-RFC
>       Time (mean ± σ): 877.6 ms ± 183.4 ms [User: 197.9 ms, System: 149.4 ms]
>       Range (min … max): 697.8 ms … 1064.4 ms 3 runs
>
>     Summary
>       './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'avar-RFC' ran
>         1.19 ± 0.30 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4'
>     + git hyperfine --warmup 1 -r 3 -L rev neeraj-v4,avar-RFC -s make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ -p rm -rf repo/.git/objects/* repo/.git/index ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .
>     Benchmark 1: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4
>       Time (mean ± σ): 1.019 s ± 0.057 s [User: 0.213 s, System: 0.194 s]
>       Range (min … max): 0.963 s … 1.076 s 3 runs
>
>     Benchmark 2: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'avar-RFC
>       Time (mean ± σ): 918.6 ms ± 34.4 ms [User: 207.8 ms, System: 164.1 ms]
>       Range (min … max): 880.6 ms … 947.5 ms 3 runs
>
>     Summary
>       './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'avar-RFC' ran
>         1.11 ± 0.07 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4'
>     + git hyperfine --warmup 1 -r 3 -L rev neeraj-v4,avar-RFC -s make CFLAGS=-O3 -p git init --bare dest.git -c rm -rf dest.git ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack
>     Benchmark 1: ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4
>       Time (mean ± σ): 1.362 s ± 0.285 s [User: 1.021 s, System: 0.186 s]
>       Range (min … max): 1.192 s … 1.691 s 3 runs
>
>     Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>
>     Benchmark 2: ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'avar-RFC
>       Time (mean ± σ): 1.188 s ± 0.009 s [User: 1.025 s, System: 0.161 s]
>       Range (min … max): 1.180 s … 1.199 s 3 runs
>
>     Summary
>       './git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'avar-RFC' ran
>         1.15 ± 0.24 times faster than './git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4'
>
> 1. I do my git hacking on a bare metal box I rent with some friends, and
>    one of them is running one those persistent video game daemons
>    written in Java. So I think all my non-RAM I/O numbers are
>    continually fuzzed by what players are doing in Minecraft or whatever
>    that thing is...

Thanks for the numbers. So if I'm understanding correctly, the difference on a real disk between quarantine and non-quarantine is 20% or so on your system?

I did my own experiment by adding a 'batch-no-quarantine' method. No quarantine was slightly faster.
* For 'git add' I found a very small difference (.29s vs .30s).
* For 'git stash' it was a bit bigger (.35s vs .55s).

This is with perf-lib, so we're just looking at min-times.

On the other hand, classic fsync is 1.04s for 'git add' and 1.21s for 'git stash', all with 500 tiny blobs. FYI, this is measured on my laptop running Ubuntu in WSL.

I don't think it's worth having a knob for no-quarantine for this small delta. I believe a better use of time for a follow-on change would be to implement an appendable pack format for newly-added objects.

Thanks,
Neeraj
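
For reference, the ratios implied by the min-times above can be recomputed with a few lines of shell. This is only a sketch of the arithmetic, not part of the measurements: `speedup` is a hypothetical helper name, the 'git add' batch time is assumed to be .30s, and the first number in each quoted pair is assumed to be the no-quarantine result.

```shell
#!/bin/sh
# speedup BASE NEW: print BASE/NEW rounded to two decimals.
# (Hypothetical helper, used only to illustrate the arithmetic.)
speedup() {
	awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}

# Classic fsync vs. batch mode with quarantine (min-times, in seconds).
speedup 1.04 0.30   # git add
speedup 1.21 0.55   # git stash

# Batch mode with quarantine vs. the experimental no-quarantine variant.
# Assumption: the first number in each quoted pair is the no-quarantine time.
speedup 0.30 0.29   # git add
speedup 0.55 0.35   # git stash
```

Each output line is a "times faster" ratio in the same sense as the hyperfine summaries quoted earlier, but computed from single min-times, so it carries no error interval.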