On Thu, Jul 29, 2021 at 03:58:34AM +0000, Elijah Newren via GitGitGadget wrote: > This series is more about strmaps & memory pools than merge logic. CC'ing > Peff since he reviewed the strmap work[1], and that work included a number > of decisions that specifically had this series in mind. I haven't been following the other optimization threads very closely, but I'll try to give my general impressions. > === Basic Optimization idea === > > In this series, I make use of memory pools to get faster allocations and > deallocations for many data structures that tend to all be deallocated at > the same time anyway. > > === Results === > > For the testcases mentioned in commit 557ac0350d ("merge-ort: begin > performance work; instrument with trace2_region_* calls", 2020-10-28), the > changes in just this series improves the performance as follows: > > Before Series After Series > no-renames: 204.2 ms ± 3.0 ms 198.3 ms ± 2.9 ms > mega-renames: 1.076 s ± 0.015 s 661.8 ms ± 5.9 ms > just-one-mega: 364.1 ms ± 7.0 ms 264.6 ms ± 2.5 ms Pretty good results for the mega-renames case. I do wonder how much this matters in practice. That case is intentionally stressing the system, though I guess it's not too far-fetched (it's mostly a big directory rename). However, just "git checkout" across the rename already takes more than a second. So on the one hand, 400ms isn't nothing. On the other, I doubt anybody is likely to notice in the grand scheme of things. And we're paying a non-trivial cost in code complexity to do it (though I do think you've done an admirable job of making that cost as low as possible). Dropping the USE_MEMORY_POOL flag and just always using a pool would make a lot of that complexity go away. I understand how it makes leak hunting harder, but I think simpler code would probably be worth the tradeoff (and in a sense, there _aren't_ leaks in an always-pool world; we're holding on to all of the memory through the whole operation). I assume your tests are just done using the regular glibc allocator. I also wondered how plugging in a better allocator might fare. Here are timings I did of your mega-renames case with three binaries: one built with USE_MEMORY_POOL set to 0, one with it set to 1, and one with it set to 0 but adding "-ltcmalloc" to EXTLIBS via config.mak. $ hyperfine \ -p 'git checkout hwmon-updates && git reset --hard fd8bdb23b91876ac1e624337bb88dc1dcc21d67e && git checkout 5.4-renames^0' \ -L version nopool,pool,tcmalloc \ './test-tool.{version} fast-rebase --onto HEAD base hwmon-updates' Benchmark #1: ./test-tool.nopool fast-rebase --onto HEAD base hwmon-updates Time (mean ± σ): 921.1 ms ± 146.0 ms [User: 843.0 ms, System: 77.5 ms] Range (min … max): 660.9 ms … 1112.2 ms 10 runs Benchmark #2: ./test-tool.pool fast-rebase --onto HEAD base hwmon-updates Time (mean ± σ): 635.4 ms ± 125.5 ms [User: 563.7 ms, System: 71.3 ms] Range (min … max): 496.8 ms … 856.7 ms 10 runs Benchmark #3: ./test-tool.tcmalloc fast-rebase --onto HEAD base hwmon-updates Time (mean ± σ): 727.3 ms ± 139.9 ms [User: 654.1 ms, System: 72.9 ms] Range (min … max): 476.3 ms … 900.5 ms 10 runs Summary './test-tool.pool fast-rebase --onto HEAD base hwmon-updates' ran 1.14 ± 0.32 times faster than './test-tool.tcmalloc fast-rebase --onto HEAD base hwmon-updates' 1.45 ± 0.37 times faster than './test-tool.nopool fast-rebase --onto HEAD base hwmon-updates' The pool allocator does come out ahead when comparing means, but the improvement is within the noise (and the fastest run was actually with tcmalloc). I was also curious about peak heap usage. According to massif, the pool version peaks at ~800k extra (out of 82MB), which is negligible. Plus it has fewer overall allocations, so it seems to actually save 4-5MB in malloc overhead (though I would imagine that varies between allocators; I'm just going from massif numbers here). So...I dunno. It's hard to assess the real-world impact of the speedup, compared to the complexity cost. Ultimately, this is changing code that you wrote and will probably maintain. So I'd leave the final decision for that tradeoff to you. I'm just injecting some thoughts and numbers. :) -Peff