Re: [PATCH 4/4] io_uring: optimise compl locking for non-shared rings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/18/22 14:54, Jens Axboe wrote:
On 3/18/22 7:52 AM, Pavel Begunkov wrote:
When only one task submits requests, most of CQEs are expected to be
filled from that task context so we have natural serialisation. That
would mean that in those cases we don't need spinlocking around CQE
posting. One downside is that it also mean that io-wq workers can't emit
CQEs directly but should do it through the original task context using
task_works. That may hurt latency and performance and might matter much
to some workloads, but it's not a huge deal in general as io-wq is a
slow path and there is some additional merit from tw completion
batching.

Not too worried about io-wq task_work for cq filling, it is the slower
path after all. And I think we can get away with doing notifications as
it's just for CQ filling. If the task is currently waiting in
cqring_wait, then it'll get woken anyway and it will process task work.
If it's in userspace, it doesn't need a notification. That should make
it somewhat lighter than requiring using TIF_NOTIFY_SIGNAL for that.

The feature should be opted-in by the userspace by setting a new
IORING_SETUP_PRIVATE_CQ flag. It doesn't work with IOPOLL, and also for
now only the task that created a ring can submit requests to it.

I know this is a WIP, but why do we need CQ_PRIVATE? And this needs to

One reason is because of the io-wq -> tw punting, which is not optimal
for e.g. active users of IOSQE_ASYNC. The second is because the
fundamental requirement is that only one task should be submitting
requests. Was thinking about automating it, e.g. when we register
a second tctx we go through a slow path waiting for all current tw
to complete and then removing an internal and not userspace visible
CQ_PRIVATE flag.

Also, as SQPOLL task is by definition the only one submitting SQEs,
was thinking about enabling it by default for them, but didn't do
because of the io-wq / IOSQE_ASYNC.

work with registered files (and ring fd) as that is probably a bigger
win than skipping the completion_lock if you're not shared anyway.

It does work with fixed/registered files and registered io_uring fds.

In regards of "a bigger win", probably in many cases, but if you submit
a good batch at once, and completion tw batching doesn't kick in (e.g.
direct bdev read of not too high intensity), it might save
N spinlock/unlock when registered ring fd would kill only one pair of
fdget/fdput.

--
Pavel Begunkov



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux