On Sun, Apr 16, 2023 at 12:15:20AM +0100, Pavel Begunkov wrote:
> On 4/14/23 16:42, Ming Lei wrote:
> > On Fri, Apr 14, 2023 at 04:07:52PM +0100, Pavel Begunkov wrote:
> > > On 4/14/23 14:53, Ming Lei wrote:
> > > > On Fri, Apr 14, 2023 at 02:01:26PM +0100, Pavel Begunkov wrote:
> > > > > On 4/14/23 08:53, Ming Lei wrote:
> > > > > > So far io_req_complete_post() only covers DEFER_TASKRUN by completing
> > > > > > request via task work when the request is completed from IOWQ.
> > > > > >
> > > > > > However, uring command could be completed from any context, and if io
> > > > > > uring is setup with DEFER_TASKRUN, the command is required to be
> > > > > > completed from current context, otherwise wait on IORING_ENTER_GETEVENTS
> > > > > > can't be wakeup, and may hang forever.
> > > > >
> > > > > fwiw, there is one legit exception, when the task is half dead
> > > > > task_work will be executed by a kthread. It should be fine as it
> > > > > locks the ctx down, but I can't help but wonder whether it's only
> > > > > ublk_cancel_queue() affected or there are more places in ublk?
> > > >
> > > > No, it isn't.
> > > >
> > > > It isn't triggered on nvme-pt just because command is always done
> > > > in task context.
> > > >
> > > > And we know more uring command cases are coming.
> > >
> > > Because all requests and cmds but ublk complete it from another
> > > task, ublk is special in this regard.
> >
> > Not sure it is true, cause it is allowed to call io_uring_cmd_done from other
> > task technically. And it could be more friendly for driver to not limit
> > its caller in the task context. Especially we have another API of
> > io_uring_cmd_complete_in_task().
>
> I agree that the cmd io_uring API can do better.
>
> > > I have several more not so related questions:
> > >
> > > 1) Can requests be submitted by some other task than ->ubq_daemon?
> >
> > Yeah, requests can be submitted by other task, but ublk driver doesn't
> > allow it because ublk driver has not knowledge when the io_uring context
> > goes away, so has to limit requests submitted from ->ubq_daemon only,
> > then use this task's information for checking if the io_uring context
> > is going to exit. When the io_uring context is dying, we need to
> > abort these uring commands(may never complete), see ublk_cancel_queue().
> >
> > The only difference is that the uring command may never complete,
> > because one uring cmd is only completed when the associated block request
> > is coming. The situation could be improved by adding API/callback for
> > notifying io_uring exit.
>
> Got it. And it sounds like you can use IORING_SETUP_SINGLE_ISSUER
> and possibly IORING_SETUP_DEFER_TASKRUN, if not already.

The ublk driver is simple, but the userspace ublk server can be quite
complicated and needs flexible settings, so in theory we shouldn't put
any limit on userspace.

> >
> > > Looking at
> > >
> > > static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
> > > {
> > > 	...
> > > 	if (ubq->ubq_daemon && ubq->ubq_daemon != current)
> > > 		goto out;
> > > }
> > >
> > > ublk_queue_cmd() avoiding io_uring way of delivery and using
> > > raw task_work doesn't seem great. Especially with TWA_SIGNAL_NO_IPI.
> >
> > Yeah, it has been in my todo list to kill task work. In ublk early time, I see
> > task work just performs better than io_uring_cmd_complete_in_task(), but
> > the gap becomes pretty small or even not visible now.
>
> It seems a bit strange, non DEFER_TASKRUN tw is almost identical to what
> you do, see __io_req_task_work_add(). Maybe it's extra callbacks on the
> execution side.
>
> Did you try DEFER_TASKRUN? Not sure it suits your case as there are
> limitations, but the queueing side of it, as well as execution and
> waiting are well optimised and should do better.

I tried DEFER_TASKRUN, which needs this fix, and didn't see an obvious
IOPS boost over IORING_SETUP_COOP_TASKRUN, which does make a big
difference.

> >
> > > 2) What the purpose of the two lines below? I see how
> > > UBLK_F_URING_CMD_COMP_IN_TASK is used, but don't understand
> > > why it changes depending on whether it's a module or not.
> >
> > task work isn't available in case of building ublk as module.
>
> Ah, makes sense now, thanks
>
> > > 3) The long comment in ublk_queue_cmd() seems quite scary.
> > > If you have a cmd / io_uring request it hold a ctx reference
> > > and is always allowed to use io_uring's task_work infra like
> > > io_uring_cmd_complete_in_task(). Why it's different for ublk?
> >
> > The thing is that we don't know if there is io_uring request for the
> > coming blk request. UBLK_IO_FLAG_ABORTED just means that the io_uring
> > context is dead, and we can't use io_uring_cmd_complete_in_task() any
> > more.
>
> Roughly got it, IIUC, there might not be a (valid) io_uring
> request backing this block request in the first place because of
> this aborting thing.

I am working on adding a notifier callback in
io_uring_try_cancel_requests(), and it looks like it works. This way,
the ublk server implementation can become quite flexible and aborting
becomes simpler; for example, the limit of a single per-queue submitter
is no longer needed, and I remember that the SPDK guys did complain
about this kind of limit.

Thanks,
Ming