On 11/21/24 15:22, Jens Axboe wrote:
> On 11/21/24 8:15 AM, Jens Axboe wrote:
>> I'd rather entertain NOT using llists for this in the first place, as it
>> gets rid of the reversing which is the main cost here. That won't change
>> the need for a retry list necessarily, as I think we'd be better off
>> with a lockless retry list still. But at least it'd get rid of the
>> reversing. Let me see if I can dig out that patch... Totally orthogonal
>> to this topic, obviously.
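
For context on the cost in question: llist_add() pushes at the head, so
a detached batch comes out in LIFO order and needs one more O(n) pass
before the work can run in submission order. A minimal sketch of the
pattern, using the kernel's <linux/llist.h> API but with a made-up tw
struct, not the actual io_uring code:

#include <linux/llist.h>

struct tw_item {
	struct llist_node node;
	void (*fn)(struct tw_item *);
};

static void run_tw_batch(struct llist_head *list)
{
	/* Atomically detach everything producers have llist_add()'ed */
	struct llist_node *n = llist_del_all(list);

	/* Pay the O(n) pass to restore FIFO order, i.e. the
	 * "reversing" being discussed */
	n = llist_reverse_order(n);

	while (n) {
		struct tw_item *item = llist_entry(n, struct tw_item, node);

		n = n->next;
		item->fn(item);
	}
}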
>
> It's here:
>
> https://lore.kernel.org/io-uring/20240326184615.458820-3-axboe@xxxxxxxxx/
>
> I did improve it further but never posted it again, fwiw.
It's nice that with something like that patch we're not restricted by
space and can be smarter about batching, e.g. splitting nr_tw into
buckets. However, the spinlock overhead could be quite painful if there
is contention. With block it's fairly uniform which CPU the tw comes
from, but with network it could be much more random. That's what Dylan
measured back then, and it's quite similar to what you've seen yourself
before with socket locks.
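
To make the contention concern concrete, the producer side of such a
scheme boils down to roughly the below (an illustrative sketch, not the
code from your patch). Every producer CPU bounces the same lock
cacheline, which is tolerable while completions stay CPU-affine but not
when they arrive from random CPUs:

#include <linux/list.h>
#include <linux/spinlock.h>

struct tw_fifo {
	spinlock_t lock;
	struct list_head list;
};

static void tw_fifo_add(struct tw_fifo *f, struct list_head *entry)
{
	unsigned long flags;

	spin_lock_irqsave(&f->lock, flags);
	/* FIFO append, so the consumer never needs a reversal pass */
	list_add_tail(entry, &f->list);
	spin_unlock_irqrestore(&f->lock, flags);
}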
Another option is to try out how a lockless list (instead of a stack)
with a double cmpxchg would perform.
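
For reference, FIFO order can also be kept lock-free without double-width
ops: the well-known Vyukov-style intrusive MPSC queue does it with a
single xchg on the tail. That's not the double-cmpxchg design I mean
above (which would CAS a {first, last} pair instead), but it shows the
shape of a list rather than a stack. A userspace C11 sketch with made-up
names, single consumer assumed:

#include <stdatomic.h>
#include <stddef.h>

struct mpsc_node {
	_Atomic(struct mpsc_node *) next;
};

struct mpsc_queue {
	_Atomic(struct mpsc_node *) tail;	/* producers xchg here */
	struct mpsc_node *head;			/* consumer-private */
	struct mpsc_node stub;			/* keeps the list non-empty */
};

static void mpsc_init(struct mpsc_queue *q)
{
	atomic_store(&q->stub.next, NULL);
	atomic_store(&q->tail, &q->stub);
	q->head = &q->stub;
}

/* Multi-producer enqueue: one xchg, entries stay in FIFO order */
static void mpsc_enqueue(struct mpsc_queue *q, struct mpsc_node *n)
{
	atomic_store(&n->next, NULL);
	struct mpsc_node *prev = atomic_exchange(&q->tail, n);
	/* prev->next may briefly lag behind tail; the consumer copes */
	atomic_store(&prev->next, n);
}

/* Single-consumer dequeue: returns NULL when empty or when a producer
 * is still inside the enqueue window above (caller retries later) */
static struct mpsc_node *mpsc_dequeue(struct mpsc_queue *q)
{
	struct mpsc_node *head = q->head;
	struct mpsc_node *next = atomic_load(&head->next);

	if (head == &q->stub) {
		if (!next)
			return NULL;
		q->head = next;
		head = next;
		next = atomic_load(&head->next);
	}
	if (next) {
		q->head = next;
		return head;
	}
	if (head != atomic_load(&q->tail))
		return NULL;
	/* head is the last real node: re-insert the stub so head can
	 * be detached without racing new producers */
	mpsc_enqueue(q, &q->stub);
	next = atomic_load(&head->next);
	if (next) {
		q->head = next;
		return head;
	}
	return NULL;
}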
--
Pavel Begunkov