On 8/18/21 5:42 AM, Pavel Begunkov wrote:
> In essence, it's about two features. The first one is implemented by
> 1-2 and saves ->uring_lock lock/unlock in a single call of
> tctx_task_work(). Should be useful for links, apolls and BPF requests
> at some moment.
>
> The second feature (3/3) is batching freeing and completing of
> IRQ-based read/write requests.
>
> Haven't got numbers yet, but just throwing it for public discussion.

I ran some numbers and it looks good to me, it's a nice boost for the IRQ
completions. It's funny how the initial move to task_work for IRQ
completions took a small hit, but there's so many optimizations that it
unlocks that it's already better than before.

I'd like to apply 1/3 for now, but it depends on both master and
for-5.15/io_uring. Hence I think it'd be better to defer that one until
after the initial batch has gone in.

For the batched locking, the principle is sound and measures out to be a
nice win. But I have a hard time getting over the passed lock state, I do
wonder if there's a cleaner way to accomplish this...

-- 
Jens Axboe