On 8/6/21 7:14 AM, Mike Galbraith wrote:
> On Thu, 2021-08-05 at 18:42 +0200, Sebastian Andrzej Siewior wrote:
>>
>> There was a throughput regression in RT compared to previous releases
>> (without this series). The regression was (based on my testing) only
>> visible in hackbench and was addressed by adding adaptive spinning to
>> RT-mutex. With that we are almost back to what we had before :)
>
> Numbers on my box say a throughput regression remains (silly fork bomb
> scenario.. yawn), which can be recouped by either turning on all
> SL[AU]B features or converting the list_lock to a raw lock.

I'm surprised the raw lock conversion still works in v3/v4, because
there's now a path where get_partial_node() takes the list_lock and can
call put_cpu_partial(), which takes the local_lock. But your results
below seem to indicate that this was without CONFIG_SLUB_CPU_PARTIAL, so
that would still work.

> They also
> seem to be saying that if you turned on PREEMPT_RT because you care
> about RT performance first and foremost (gee), you'll do neither of
> those, because either will eliminate an RT performance progression.

That was my assumption: there would be some tradeoff, and RT is willing
to sacrifice some throughput here. The loss should only be visible if
your benchmark is close to a slab microbenchmark, as hackbench is.

Thanks again!
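
To make the nesting concern above concrete, here is a minimal userspace sketch of the rule in play. It is a toy model, not kernel code, and rests on two assumptions: on PREEMPT_RT a local_lock is a sleeping lock, and a sleeping lock must not be acquired while a raw spinlock is held. All model_* names, struct lock_ctx and enum lock_kind are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only. On PREEMPT_RT a raw spinlock never sleeps, while
 * spinlock_t and local_lock_t are assumed to be sleeping locks. */
enum lock_kind { LOCK_RAW_SPIN, LOCK_SLEEPING };

struct lock_ctx {
    int raw_depth; /* number of raw spinlocks currently held */
};

/* Returns true if taking a lock of this kind is valid in this context. */
static bool model_acquire(struct lock_ctx *ctx, enum lock_kind kind)
{
    if (kind == LOCK_SLEEPING)
        return ctx->raw_depth == 0; /* must not sleep under a raw lock */
    ctx->raw_depth++;
    return true;
}

static void model_release_raw(struct lock_ctx *ctx)
{
    assert(ctx->raw_depth > 0);
    ctx->raw_depth--;
}

/* Models the nesting described above: get_partial_node() holds the
 * list_lock and may reach put_cpu_partial(), which takes the local_lock.
 * cpu_partial_enabled stands in for CONFIG_SLUB_CPU_PARTIAL. */
static bool model_get_partial_node(bool list_lock_is_raw,
                                   bool cpu_partial_enabled)
{
    struct lock_ctx ctx = { 0 };
    enum lock_kind list_kind =
        list_lock_is_raw ? LOCK_RAW_SPIN : LOCK_SLEEPING;
    bool ok = model_acquire(&ctx, list_kind); /* list_lock */

    if (cpu_partial_enabled)
        ok = model_acquire(&ctx, LOCK_SLEEPING) && ok; /* local_lock */

    if (list_lock_is_raw)
        model_release_raw(&ctx);
    return ok;
}
```

In this model, a raw list_lock is only valid when the put_cpu_partial() path is compiled out, which matches the observation that the raw lock conversion still works without CONFIG_SLUB_CPU_PARTIAL.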