On 11/21/2021 10:32 PM, Han Xin wrote:
> From: Han Xin <hanxin.hx@xxxxxxxxxxxxxxx>
>
> We used to call "get_data()" in "unpack_non_delta_entry()" to read the
> entire contents of a blob object, no matter how big it is. This
> implementation may consume all the memory and cause OOM.
>
> By implementing a zstream version of the input_stream interface, we can
> use a small fixed buffer for "unpack_non_delta_entry()".
>
> However, unpacking non-delta objects from a stream instead of from an
> entire buffer incurs a 10% performance penalty. Therefore, only unpack
> objects larger than the "big_file_threshold" in zstream. See the
> following benchmarks:
>
>     $ hyperfine \
>       --prepare 'rm -rf dest.git && git init --bare dest.git' \
>       'git -C dest.git unpack-objects <binary_320M.pack'
>     Benchmark 1: git -C dest.git unpack-objects <binary_320M.pack
>       Time (mean ± σ):     10.029 s ±  0.270 s    [User: 8.265 s, System: 1.522 s]
>       Range (min … max):    9.786 s … 10.603 s    10 runs
>
>     $ hyperfine \
>       --prepare 'rm -rf dest.git && git init --bare dest.git' \
>       'git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_320M.pack'
>     Benchmark 1: git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_320M.pack
>       Time (mean ± σ):     10.859 s ±  0.774 s    [User: 8.813 s, System: 1.898 s]
>       Range (min … max):    9.884 s … 12.192 s    10 runs

It seems that you want us to compare this pair of results, and hyperfine
can assist with that by including multiple benchmarks (with labels,
using '-n') as follows:

$ hyperfine \
	--prepare 'rm -rf dest.git && git init --bare dest.git' \
	-n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <big.pack' \
	-n 'new' '~/_git/git/git -C dest.git unpack-objects <big.pack' \
	-n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <big.pack'

Benchmark 1: old
  Time (mean ± σ):     20.835 s ±  0.058 s    [User: 14.510 s, System: 6.284 s]
  Range (min … max):   20.741 s … 20.909 s    10 runs

Benchmark 2: new
  Time (mean ± σ):     26.515 s ±  0.072 s    [User: 19.783 s, System: 6.696 s]
  Range (min … max):   26.419 s … 26.611 s    10 runs

Benchmark 3: new (small threshold)
  Time (mean ± σ):     26.523 s ±  0.101 s    [User: 19.805 s, System: 6.680 s]
  Range (min … max):   26.416 s … 26.739 s    10 runs

Summary
  'old' ran
    1.27 ± 0.00 times faster than 'new'
    1.27 ± 0.01 times faster than 'new (small threshold)'

(Here, 'old' is a build of the latest 'master' branch, while 'new' has
your patches applied on top.)

Notice that in this example I used a pack with many small objects
(mostly commits and trees), and this change introduces significant
overhead in that case. It would be nice to understand this overhead and
fix it before taking this change any further.

Thanks,
-Stolee
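
P.S. For list readers who want to see the shape of the fixed-buffer
approach under discussion: below is a minimal, self-contained sketch
using plain zlib. It is illustrative only -- the actual series builds on
git's own zstream wrappers and its input_stream interface, and the
buffer size and function name here are my own assumptions, not taken
from the patches.

#include <stdio.h>
#include <zlib.h>

#define STREAM_BUF 16384	/* small fixed buffer; size is an assumption */

/*
 * Inflate the compressed stream on 'in' to 'out', holding at most
 * STREAM_BUF bytes of input and STREAM_BUF bytes of output in memory
 * at any one time. Returns 0 on success, -1 on any zlib or I/O error.
 */
static int inflate_in_chunks(FILE *in, FILE *out)
{
	unsigned char ibuf[STREAM_BUF], obuf[STREAM_BUF];
	z_stream zs = { 0 };	/* zalloc/zfree/opaque default to Z_NULL */
	int ret = Z_OK;

	if (inflateInit(&zs) != Z_OK)
		return -1;

	do {
		zs.avail_in = fread(ibuf, 1, sizeof(ibuf), in);
		if (ferror(in))
			goto fail;
		if (!zs.avail_in)
			break;		/* truncated input; ret != Z_STREAM_END */
		zs.next_in = ibuf;

		/* Drain all output this chunk of input produces. */
		do {
			size_t have;

			zs.avail_out = sizeof(obuf);
			zs.next_out = obuf;
			ret = inflate(&zs, Z_NO_FLUSH);
			if (ret == Z_NEED_DICT || ret == Z_DATA_ERROR ||
			    ret == Z_MEM_ERROR || ret == Z_STREAM_ERROR)
				goto fail;
			have = sizeof(obuf) - zs.avail_out;
			if (fwrite(obuf, 1, have, out) != have)
				goto fail;
		} while (zs.avail_out == 0);
	} while (ret != Z_STREAM_END);

	inflateEnd(&zs);
	return ret == Z_STREAM_END ? 0 : -1;
fail:
	inflateEnd(&zs);
	return -1;
}

The point of this shape is that peak memory stays bounded by the two
fixed buffers no matter how large the inflated object is, which is
exactly the OOM problem the commit message describes; the trade-off is
the extra per-chunk loop overhead that shows up in the numbers above.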