On 8/11/21 12:28 PM, Pavel Begunkov wrote:
> With some tricks, we can avoid refcounting in most of the cases and
> so save on atomics. 1-2 are simple preparations and 3-4 are the meat.
> 5/5 is a hint to the compiler, which stopped to similarly optimise it
> as is.
>
> Jens tried out a prototype before, apparently it gave ~3% win for
> the default read test. Not much has changed since then, so I'd
> expect same result, and also hope that it should be of even greater
> benefit to multithreaded workloads.
>
> The previous version had a flaw, so it was decided to move all
> completions out of IRQ and base on that assumption. On top of
> io_uring-irq branch.

This is really nice, both in terms of how the series is laid out, but
also the reasoning behind it. I can't shoot any immediate holes in it,
let's get it queued for 5.15.

-- 
Jens Axboe