Hi,

this is my first patch series taking an actual look at write performance
for the reftable backend. This series addresses two major pain points:

  - Duplicate directory/file conflict checks when writing refs.

  - Allocation churn when compressing log blocks.

Overall though I found that there is not much point in investigating
write performance in the reftable library itself, at least not right
now. This is mostly because write performance is heavily dominated by
random ref reads. And while past patch series have optimized scanning
through refs linearly, seeking random refs isn't well-optimized yet. So
once all in-flight series relating to reftable performance have landed I
will focus on random ref reads next.

For the bigger picture, the following benchmarks show performance
compared to the "files" backend after applying this patch series.
Writing many refs in a single transaction:

  Benchmark 1: update-ref: create many refs (refformat = files, refcount = 100000)
    Time (mean ± σ):     10.085 s ±  0.057 s    [User: 1.876 s, System: 8.161 s]
    Range (min … max):   10.013 s … 10.202 s    10 runs

  Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 100000)
    Time (mean ± σ):      2.768 s ±  0.018 s    [User: 1.381 s, System: 1.383 s]
    Range (min … max):    2.745 s …  2.804 s    10 runs

  Summary
    update-ref: create many refs (refformat = reftable, refcount = 100000) ran
      3.64 ± 0.03 times faster than update-ref: create many refs (refformat = files, refcount = 100000)

And for writing many refs sequentially in separate transactions:

  Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 10000)
    Time (mean ± σ):     40.286 s ±  0.086 s    [User: 22.241 s, System: 17.912 s]
    Range (min … max):   40.166 s … 40.410 s    10 runs

  Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 10000)
    Time (mean ± σ):     44.046 s ±  0.137 s    [User: 23.790 s, System: 20.146 s]
    Range (min … max):   43.813 s … 44.301 s    10 runs

  Summary
    update-ref: create refs sequentially (refformat = files, refcount = 10000) ran
      1.09 ± 0.00 times faster than update-ref: create refs sequentially (refformat = reftable, refcount = 10000)

This is, to the best of my knowledge, the last area where the "files"
backend outperforms the "reftable" backend. This is partly because
writes perform auto-compaction with the "reftable" backend.

Patrick

Patrick Steinhardt (9):
  refs/reftable: fix D/F conflict error message on ref copy
  refs/reftable: perform explicit D/F check when writing symrefs
  refs/reftable: skip duplicate name checks
  refs/reftable: don't recompute committer ident
  reftable/writer: refactorings for `writer_add_record()`
  reftable/writer: refactorings for `writer_flush_nonempty_block()`
  reftable/block: reuse zstream when writing log blocks
  reftable/block: reuse compressed array
  reftable/writer: reset `last_key` instead of releasing it

 refs/reftable-backend.c    |  80 ++++++++++++++++++-------
 reftable/block.c           |  83 ++++++++++++++++----------
 reftable/block.h           |   4 ++
 reftable/writer.c          | 119 ++++++++++++++++++++++++-------------
 t/t0610-reftable-basics.sh |  35 ++++++++++-
 5 files changed, 227 insertions(+), 94 deletions(-)

--
2.44.GIT