On Wed, Jun 15 2022, René Scharfe wrote:

> Git uses zlib for its own object store, but calls gzip when creating tgz
> archives. Add an option to perform the gzip compression for the latter
> using zlib, without depending on the external gzip binary.
>
> Plug it in by making write_block a function pointer and switching to a
> compressing variant if the filter command has the magic value "git
> archive gzip". Does that indirection slow down tar creation? Not
> really, at least not in this test:
>
> $ hyperfine -w3 -L rev HEAD,origin/main -p 'git checkout {rev} && make' \
>   './git -C ../linux archive --format=tar HEAD # {rev}'

Shameless plug:
https://lore.kernel.org/git/211201.86r1aw9gbd.gmgdl@xxxxxxxxxxxxxxxxxxx/

I.e. a "hyperfine" wrapper I wrote to make exactly this sort of thing
easier. You'll find that you need less or no --warmup with it, since the
checkout flip-flopping and re-making (and the resulting FS and other
cache eviction) go away, as we'll use a different "git worktree" for
each "rev".

(Also, putting those on a ramdisk really helps.)

> Benchmark #1: ./git -C ../linux archive --format=tar HEAD # HEAD
>   Time (mean ± σ):      4.044 s ±  0.007 s    [User: 3.901 s, System: 0.137 s]
>   Range (min … max):    4.038 s …  4.059 s    10 runs
>
> Benchmark #2: ./git -C ../linux archive --format=tar HEAD # origin/main
>   Time (mean ± σ):      4.047 s ±  0.009 s    [User: 3.903 s, System: 0.138 s]
>   Range (min … max):    4.038 s …  4.066 s    10 runs
>
> How does tgz creation perform?
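For reference, the worktree-per-rev setup that wrapper automates can be
sketched roughly like this (a Python sketch; make_worktrees and the
temp-dir layout are illustrative names of mine, not anything from the
wrapper itself):

```python
import os
import subprocess
import tempfile


def make_worktrees(repo: str, revs: list[str]) -> list[str]:
    """Create one throwaway worktree per rev.

    With a separate checkout per rev, a benchmark never has to
    flip-flop the working tree (and evict FS caches) in a single
    directory between runs.
    """
    base = tempfile.mkdtemp(prefix="bench-wt-")
    dirs = []
    for rev in revs:
        d = os.path.join(base, rev.replace("/", "-"))
        # a commit-ish argument gives a detached-HEAD worktree,
        # so the same branch can stay checked out elsewhere
        subprocess.run(
            ["git", "-C", repo, "worktree", "add", d, rev],
            check=True, capture_output=True,
        )
        dirs.append(d)
    return dirs


# Usage idea (build each worktree once, then benchmark without -p/-w):
#   dirs = make_worktrees(".", ["HEAD", "origin/main"])
#   hyperfine -L dir <dir1>,<dir2> '{dir}/git -C ../linux archive --format=tar HEAD'
```

Putting `base` on a ramdisk (e.g. under /dev/shm on Linux) gives the
same benefit mentioned above.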
>
> $ hyperfine -w3 -L command 'gzip -cn','git archive gzip' \
>   './git -c tar.tgz.command="{command}" -C ../linux archive --format=tgz HEAD'
>
> Benchmark #1: ./git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD
>   Time (mean ± σ):     20.404 s ±  0.006 s    [User: 23.943 s, System: 0.401 s]
>   Range (min … max):   20.395 s … 20.414 s    10 runs
>
> Benchmark #2: ./git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD
>   Time (mean ± σ):     23.807 s ±  0.023 s    [User: 23.655 s, System: 0.145 s]
>   Range (min … max):   23.782 s … 23.857 s    10 runs
>
> Summary
>   './git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD' ran
>     1.17 ± 0.00 times faster than './git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD'
>
> So the internal implementation takes 17% longer on the Linux repo, but
> uses 2% less CPU time. That's because the external gzip can run in
> parallel on its own processor, while the internal one works sequentially
> and avoids the inter-process communication overhead.
>
> What are the benefits? Only an internal sequential implementation can
> offer this eco mode, and it allows avoiding the gzip(1) requirement.

I had been keeping one eye on this series, but didn't look at it in any
detail. I found this after reading 6/6, which I think in any case could
really use some "why" summary, which seems to be mostly covered here.

I.e. it's unclear whether the "drop the dependency on gzip(1)" in 6/6
refers to the GZIP test dependency, or to users being unlikely to have
"gzip(1)" on their systems.

If it's the latter, I'd much rather (as a user) take a 17% wallclock
improvement over a 2% saving in CPU time. I mostly care about my own
time, not that of the CPU.

Can't we have our 6/6 cake much more easily and eat it too by learning a
"fallback" mode, i.e. we try to invoke gzip, and if that doesn't work
use the "internal" one?
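For illustration, both the internal approach (gzip-format output from
zlib alone, no gzip(1) binary involved) and the proposed fallback can be
sketched in a few lines of Python (the function names are mine; git's
actual implementation is in C):

```python
import gzip
import shutil
import subprocess
import zlib


def gzip_compress_internal(data: bytes, level: int = 6) -> bytes:
    """Compress to gzip format using only zlib.

    wbits = 16 + MAX_WBITS tells zlib to emit a gzip header and
    trailer instead of a raw zlib stream -- the same capability the
    internal implementation relies on.
    """
    c = zlib.compressobj(level, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    return c.compress(data) + c.flush()


def gzip_compress(data: bytes) -> bytes:
    """Fallback sketch: prefer the external gzip, else go internal."""
    if shutil.which("gzip"):
        return subprocess.run(
            ["gzip", "-cn"], input=data, capture_output=True, check=True
        ).stdout
    return gzip_compress_internal(data)


payload = b"hello tgz" * 1000
out = gzip_compress_internal(payload)
assert out[:2] == b"\x1f\x8b"           # gzip magic bytes
assert gzip.decompress(out) == payload  # round-trips
```

(A real fallback in git would presumably also have to handle gzip dying
mid-stream, not just being absent, which is where it gets less trivial.)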
Re the "eco mode": I also wonder how much of the overhead you're seeing
for both that 17% and 2% would go away if you pinned both processes to
the same CPU. I can't recall the command offhand, but IIRC taskset or
numactl can do that. I.e. is this really measuring IPC overhead, or
inter-CPU overhead on your system?
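For what it's worth, the pinning can be done from the shell with
taskset(1) (e.g. "taskset -c 0 <benchmark command>") or, on Linux, from
inside the process itself; a minimal sketch in Python (the choice of CPU
is arbitrary):

```python
import os

# Pin this process -- and any children it spawns, such as an external
# gzip -- to a single CPU, so the external-vs-internal comparison
# measures IPC cost rather than the benefit of a second core.
# os.sched_setaffinity() is Linux-only, hence the guard; elsewhere,
# launch the benchmark under taskset(1) or numactl(1) instead.
if hasattr(os, "sched_setaffinity"):
    cpu = min(os.sched_getaffinity(0))   # pick one CPU we may use
    os.sched_setaffinity(0, {cpu})       # restrict to just that CPU
    assert os.sched_getaffinity(0) == {cpu}
```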