On 1/19/24 01:23, Pavel Begunkov wrote:
The put side of the percpu bio caching mainly targets completions in hard irq context, but that context is not guaranteed, so we guard against the other cases by switching interrupts off. Disabling interrupts while they're already disabled is supposed to be fast, but profiling shows it's far from perfect.

Instead, we can infer the interrupt state from in_hardirq(), which is just a fast variable read, and fall back to the normal bio_free() otherwise. With that, the caching no longer covers softirq/task completions, but that should be fine: we have never measured whether caching brings anything in those scenarios.

Profiling indicates that the bio_put() cost is reduced by ~3.5 times (1.76% -> 0.49%), and the throughput of CPU-bound benchmarks improves by around 1% (t/io_uring with high QD and several drives).
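
For illustration only, here is a rough userspace sketch of the control flow described above. It is not the patch itself: every mock_* name is a hypothetical stand-in (mock_in_hardirq() for in_hardirq(), mock_bio_free() for bio_free()), and the cache is a plain list rather than the real percpu structure. It only shows the idea of caching when the context check passes and falling back to the normal free path otherwise, without touching the interrupt state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct mock_bio {
	struct mock_bio *next;
};

struct mock_cache {
	struct mock_bio *free_list;	/* bios kept for reuse */
	unsigned int nr;		/* number of cached bios */
};

static bool mock_in_hardirq_flag;	/* assumption: models in_hardirq() */
static unsigned int mock_freed;		/* counts fallbacks to the free path */

static bool mock_in_hardirq(void)
{
	return mock_in_hardirq_flag;
}

/* Stand-in for the normal bio_free() path; here it only counts calls. */
static void mock_bio_free(struct mock_bio *bio)
{
	(void)bio;
	mock_freed++;
}

/*
 * Put side of the cache: a single flag read decides the path, and the
 * interrupt state is never saved, disabled, or restored here.
 */
static void mock_bio_put_cached(struct mock_cache *cache, struct mock_bio *bio)
{
	assert(cache != NULL && bio != NULL);

	if (mock_in_hardirq()) {
		/* Hard irq context: push onto the cache list. */
		bio->next = cache->free_list;
		cache->free_list = bio;
		cache->nr++;
	} else {
		/* softirq/task context: skip the cache entirely. */
		mock_bio_free(bio);
	}
}
```

In this sketch the trade-off is visible directly: the cheap path is taken only when the context check succeeds, and everything else pays for a regular free instead of an interrupt disable/enable pair.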
Let me know if there are any concerns with the patch.

--
Pavel Begunkov