On 12/20/22 11:06 AM, Pavel Begunkov wrote:
> On 12/20/22 17:58, Pavel Begunkov wrote:
>> NOT FOR INCLUSION, needs some ring poll workarounds
>>
>> Flush completions is done either from the submit syscall or by the
>> task_work, both are in the context of the submitter task, and when it
>> goes for a single threaded rings like implied by ->task_complete, there
>> won't be any waiters on ->cq_wait but the master task. That means that
>> there can be no tasks sleeping on cq_wait while we run
>> __io_submit_flush_completions() and so waking up can be skipped.
>
> Not trivial to benchmark as we need something to emulate a task_work
> coming in the middle of waiting. I used the diff below to complete nops
> in tw and removed preliminary tw runs for the "in the middle of waiting"
> part. IORING_SETUP_SKIP_CQWAKE controls whether we use optimisation or
> not.
>
> It gets around 15% more IOPS (6769526 -> 7803304), which correlates
> to 10% of wakeup cost in profiles. Another interesting part is that
> waitqueues are excessive for our purposes and we can replace cq_wait
> with something less heavier, e.g. atomic bit set

I was thinking something like that the other day, for most purposes
the wait infra is too heavy handed for our case. If we exclude poll for
a second, everything else is internal and eg doesn't need IRQ safe
locking at all. That's just one part of it. But I didn't have a good
idea for the poll() side of things, which would be required to make
some progress there.

-- 
Jens Axboe