Hi, Christian and Hariom,

I would like to use this patch series as the final version, for now, of
my GSoC project:

https://github.com/adlternative/git/commits/cat-file-reuse-ref-filter-logic

The optimizations in the ref-filter-opt-code-logic and ref-filter-opt-perf
branches do not yet carry over to git cat-file --batch, so the
cat-file-reuse-ref-filter-logic branch is the most effective one for now.

This is the final performance regression test result:

Test                                        upstream/master   this tree
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.06(0.06+0.00)   0.08(0.07+0.00) +33.3%
1006.3: cat-file --batch-check with atoms   0.06(0.04+0.01)   0.06(0.06+0.00) +0.0%
1006.4: cat-file --batch                    0.49(0.47+0.02)   0.48(0.47+0.01) -2.0%
1006.5: cat-file --batch with atoms         0.48(0.44+0.03)   0.47(0.46+0.01) -2.1%

In this test, git cat-file --batch is about 2% faster, while
git cat-file --batch-check is still 33.3% slower.

In absolute terms, however, the slowdown of git cat-file --batch-check
is small.

upstream/master (225bc32a98):

$ hyperfine --warmup=10 "~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects"
Benchmark #1: ~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects
  Time (mean ± σ):     596.2 ms ±   5.7 ms    [User: 563.0 ms, System: 32.5 ms]
  Range (min … max):   586.9 ms … 607.9 ms    10 runs

cat-file-reuse-ref-filter-logic (709a0c5c12):

$ hyperfine --warmup=10 "~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects"
Benchmark #1: ~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects
  Time (mean ± σ):     601.3 ms ±   5.8 ms    [User: 566.9 ms, System: 33.9 ms]
  Range (min … max):   596.7 ms … 613.3 ms    10 runs

The mean execution times of git cat-file --batch-check differ by only
about 5 ms (596.2 ms vs. 601.3 ms, less than 1%).

But look at how the execution time of git cat-file --batch changes:

upstream/master (225bc32a98):

$ time ~/git/bin-wrappers/git cat-file --batch --batch-all-objects >/dev/null
/home/adl/git/bin-wrappers/git cat-file --batch --batch-all-objects >  24.61s user 0.30s system 99% cpu 24.908 total

cat-file-reuse-ref-filter-logic (709a0c5c12):

$ time ~/git/bin-wrappers/git cat-file --batch --batch-all-objects >/dev/null
cat-file --batch --batch-all-objects > /dev/null  25.10s user 0.30s system 99% cpu 25.417 total

Here the execution time has increased by about 0.5 seconds (24.908 s vs.
25.417 s, roughly 2%). Intuition tells me that the performance of
git cat-file --batch matters more than that of --batch-check.

The reason is that the original git cat-file code adds the object data
directly to the output buffer, whereas with the ref-filter logic the
object data must first be copied into intermediate data (atom_value) and
only then into the output buffer.

At present we cannot easily eliminate the intermediate data, because
git for-each-ref --sort depends heavily on it, but we can reduce the
overhead of copying and allocating memory as much as possible.

I had an idea that I have not implemented yet: delayed (lazy) evaluation
of part of the data. More specifically, the concrete output content would
be formed only when the data is about to be added to the output buffer.
This may be a way to bypass the intermediate data.

Optimistically, I think this patch series can be merged with the current
performance of git cat-file --batch. Of course, it still needs more
suggestions from reviewers.

Thanks.
--
ZheNing Hu
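
P.S. To make the delayed-evaluation idea a bit more concrete, here is a
rough, standalone sketch. It is not taken from the patch series and does
not use any real git API: struct out_buf (a stand-in for a growable
output buffer), struct lazy_value, out_add(), emit_eager(), lazy_from()
and emit_lazy() are all hypothetical names invented for illustration. It
only contrasts "copy into an intermediate value, then into the output"
with "remember where the data lives and copy once at output time".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer (illustration only, not git's). */
struct out_buf {
	char *data;
	size_t len, cap;
};

/* Append n bytes to the output buffer, growing it when needed. */
static void out_add(struct out_buf *out, const char *src, size_t n)
{
	if (out->len + n + 1 > out->cap) {
		size_t cap = (out->len + n + 1) * 2;
		char *p = realloc(out->data, cap);
		assert(p != NULL); /* sketch only: no real error handling */
		out->data = p;
		out->cap = cap;
	}
	memcpy(out->data + out->len, src, n);
	out->len += n;
	out->data[out->len] = '\0';
}

/* Eager path: object data -> intermediate copy -> output (two copies). */
static void emit_eager(struct out_buf *out, const char *obj, size_t n)
{
	char *tmp = malloc(n ? n : 1);
	assert(tmp != NULL);
	memcpy(tmp, obj, n);   /* copy 1: object -> intermediate value */
	out_add(out, tmp, n);  /* copy 2: intermediate value -> output */
	free(tmp);
}

/* Hypothetical delayed value: a borrowed view, nothing is copied yet. */
struct lazy_value {
	const char *src; /* not owned; must stay valid until output time */
	size_t len;
};

/* Record where the data lives instead of copying it. */
static struct lazy_value lazy_from(const char *obj, size_t n)
{
	struct lazy_value v = { obj, n };
	return v;
}

/* Delayed path: the only copy happens when the output is formed. */
static void emit_lazy(struct out_buf *out, const struct lazy_value *v)
{
	out_add(out, v->src, v->len);
}
```

Both paths produce the same output; the delayed one just skips the
intermediate allocation and copy. The catch is that the borrowed pointer
has to stay valid until the output is formed, and anything that sorts on
the value (as git for-each-ref --sort does) would presumably still need
it materialized.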