Re: [PATCH] io_uring: complete request via task work in case of DEFER_TASKRUN

On Sun, Apr 16, 2023 at 12:15:20AM +0100, Pavel Begunkov wrote:
> On 4/14/23 16:42, Ming Lei wrote:
> > On Fri, Apr 14, 2023 at 04:07:52PM +0100, Pavel Begunkov wrote:
> > > On 4/14/23 14:53, Ming Lei wrote:
> > > > On Fri, Apr 14, 2023 at 02:01:26PM +0100, Pavel Begunkov wrote:
> > > > > On 4/14/23 08:53, Ming Lei wrote:
> > > > > > So far io_req_complete_post() only covers DEFER_TASKRUN by completing
> > > > > > request via task work when the request is completed from IOWQ.
> > > > > > 
> > > > > > However, uring command could be completed from any context, and if io
> > > > > > uring is setup with DEFER_TASKRUN, the command is required to be
> > > > > > completed from current context, otherwise wait on IORING_ENTER_GETEVENTS
> > > > > > can't be wakeup, and may hang forever.
> > > > > 
> > > > > fwiw, there is one legit exception, when the task is half dead
> > > > > task_work will be executed by a kthread. It should be fine as it
> > > > > locks the ctx down, but I can't help but wonder whether it's only
> > > > > ublk_cancel_queue() affected or there are more places in ublk?
> > > > 
> > > > No, it isn't.
> > > > 
> > > > It isn't triggered on nvme-pt just because command is always done
> > > > in task context.
> > > > 
> > > > And we know more uring command cases are coming.
> > > 
> > > Because all requests and cmds but ublk complete it from another
> > > task, ublk is special in this regard.
> > 
> > Not sure it is true, cause it is allowed to call io_uring_cmd_done from other
> > task technically. And it could be more friendly for driver to not limit
> > its caller in the task context. Especially we have another API of
> > io_uring_cmd_complete_in_task().
> 
> I agree that the cmd io_uring API can do better.
> 
> 
> > > I have several more not so related questions:
> > > 
> > > 1) Can requests be submitted by some other task than ->ubq_daemon?
> > 
> > Yeah, requests can be submitted by other task, but ublk driver doesn't
> > allow it because ublk driver has not knowledge when the io_uring context
> > goes away, so has to limit requests submitted from ->ubq_daemon only,
> > then use this task's information for checking if the io_uring context
> > is going to exit. When the io_uring context is dying, we need to
> > abort these uring commands(may never complete), see ublk_cancel_queue().
> > 
> > The only difference is that the uring command may never complete,
> > because one uring cmd is only completed when the associated block request
> > is coming. The situation could be improved by adding API/callback for
> > notifying io_uring exit.
> 
> Got it. And it sounds like you can use IORING_SETUP_SINGLE_ISSUER
> and possibly IORING_SETUP_DEFER_TASKRUN, if not already.

The ublk driver is simple, but the userspace ublk server can be quite
complicated and needs flexible settings, so in theory we shouldn't put
any limits on userspace.

> 
> 
> > > Looking at
> > > 
> > > static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
> > > {
> > >      ...
> > >      if (ubq->ubq_daemon && ubq->ubq_daemon != current)
> > >         goto out;
> > > }
> > > 
> > > ublk_queue_cmd() avoiding io_uring way of delivery and using
> > > raw task_work doesn't seem great. Especially with TWA_SIGNAL_NO_IPI.
> > 
> > Yeah, it has been in my todo list to kill task work. In ublk early time,
> 
> I see
> 
> > task work just performs better than io_uring_cmd_complete_in_task(), but
> > the gap becomes pretty small or even not visible now.
> 
> It seems a bit strange, non DEFER_TASKRUN tw is almost identical to what
> you do, see __io_req_task_work_add(). Maybe it's extra callbacks on the
> execution side.
> 
> Did you try DEFER_TASKRUN? Not sure it suits your case as there are
> limitations, but the queueing side of it, as well as execution and
> waiting are well optimised and should do better.

I tried DEFER_TASKRUN, which needs this fix, and did not see an obvious
IOPS boost over IORING_SETUP_COOP_TASKRUN, which does make a big difference.
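
For reference, the constraint that ties the setup flags discussed here
together can be sketched in plain userspace C. This is only a model, not
kernel code: the MODEL_ names and model_check_setup_flags() are hypothetical.
The bit values are assumed to match include/uapi/linux/io_uring.h, and the
one rule shown is that io_uring_setup() rejects IORING_SETUP_DEFER_TASKRUN
with -EINVAL unless IORING_SETUP_SINGLE_ISSUER is also set.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Bit values assumed to match include/uapi/linux/io_uring.h. They are
 * redefined under hypothetical MODEL_ names so that this sketch builds
 * without recent kernel headers.
 */
#define MODEL_SETUP_COOP_TASKRUN   (1U << 8)
#define MODEL_SETUP_SINGLE_ISSUER  (1U << 12)
#define MODEL_SETUP_DEFER_TASKRUN  (1U << 13)

/*
 * Hypothetical helper modelling a single setup-time rule:
 * DEFER_TASKRUN is only accepted together with SINGLE_ISSUER.
 * Returns 0 if the combination passes this rule, -EINVAL otherwise.
 */
static int model_check_setup_flags(uint32_t flags)
{
	if ((flags & MODEL_SETUP_DEFER_TASKRUN) &&
	    !(flags & MODEL_SETUP_SINGLE_ISSUER))
		return -EINVAL;
	return 0;
}
```

Under this model, a ublk server that wants DEFER_TASKRUN has to accept the
single-submitter restriction for the ring, whereas COOP_TASKRUN on its own
carries no such requirement.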

> 
> 
> > > 2) What the purpose of the two lines below? I see how
> > > UBLK_F_URING_CMD_COMP_IN_TASK is used, but don't understand
> > > why it changes depending on whether it's a module or not.
> > 
> > task work isn't available in case of building ublk as module.
> 
> Ah, makes sense now, thanks
> 
> > > 3) The long comment in ublk_queue_cmd() seems quite scary.
> > > If you have a cmd / io_uring request it hold a ctx reference
> > > and is always allowed to use io_uring's task_work infra like
> > > io_uring_cmd_complete_in_task(). Why it's different for ublk?
> > 
> > The thing is that we don't know if there is io_uring request for the
> > coming blk request. UBLK_IO_FLAG_ABORTED just means that the io_uring
> > context is dead, and we can't use io_uring_cmd_complete_in_task() any
> > more.
> 
> Roughly got it, IIUC, there might not be a (valid) io_uring
> request backing this block request in the first place because of
> this aborting thing.

I am working on adding a notifier callback in io_uring_try_cancel_requests(),
and it looks like it works. This way, the ublk server implementation can
become quite flexible and aborting becomes simpler; for example, the limit
of a single per-queue submitter is no longer needed, and I remember that the
SPDK guys did complain about this kind of limit.


Thanks,
Ming



