On 15/12/2020 02:28, Xiaoguang Wang wrote:
> hi,
>
>> On 14/12/2020 15:49, Xiaoguang Wang wrote:
>>> io_iopoll_complete() does not hold completion_lock to complete polled
>>> io, so in io_wq_submit_work() we cannot call io_req_complete() directly
>>> to complete polled io, otherwise there may be concurrent access to cqring,
>>> defer_list, etc., which is not safe. Commit dad1b1242fd5 ("io_uring: always
>>> let io_iopoll_complete() complete polled io") has fixed this issue, but
>>> Pavel reported that IOPOLL apart from rw can do buf reg/unreg requests
>>> (IORING_OP_PROVIDE_BUFFERS or IORING_OP_REMOVE_BUFFERS), so the fix is
>>> not good.
>>>
>>> Given that io_iopoll_complete() is always called under uring_lock, here
>>> for polled io we can also take uring_lock to fix this issue.
>>
>> One thing I don't like is that io_wq_submit_work() won't be able to
>> publish an event while someone is polling io_uring_enter(ENTER_GETEVENTS),
>> because both take the lock. The problem is when the poller waits
>> for an event that is currently in io-wq (i.e. io_wq_submit_work()).
>> The polling loop will eventually exit, so that's not a deadlock, but
>> the latency, etc. would be huge.
> In this patch we only hold uring_lock for polled io in the error path, so
> I think normally it may not be an issue, and it seems the critical section
> is not that big, so it also may not result in huge latency.

To give a bit of context, it breaks out of io_iopoll_check()'s loop
with need_resched(). Anyway, fair enough, probably should be dealt
with separately.
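For reference, this is roughly the shape of the loop in question (a
condensed sketch of io_iopoll_check() from that era's fs/io_uring.c,
not the exact mainline code; the periodic drop/relock to run task_work
and the early cqring check are elided):

static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
{
	unsigned int nr_events = 0;
	int ret = 0;

	/*
	 * uring_lock is held while polling (the real code periodically
	 * drops it to run task_work), so an io-wq worker that also wants
	 * uring_lock can't publish its completion until we let go.
	 */
	mutex_lock(&ctx->uring_lock);
	do {
		/* reap already-completed polled requests */
		ret = io_iopoll_getevents(ctx, &nr_events, min);
		if (ret <= 0)
			break;
		ret = 0;
		/* need_resched() is what eventually breaks us out */
	} while (min && !nr_events && !need_resched());
	mutex_unlock(&ctx->uring_lock);

	return ret;
}

So the loop does terminate, but until need_resched() fires, a worker
contending for uring_lock sits behind the poller.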
> I also noticed that the current code holds uring_lock in the
> io_wq_submit_work() call chain:
> ==> io_wq_submit_work()
> ====> io_issue_sqe()
> ======> io_provide_buffers()
> ========> io_ring_submit_lock(ctx, !force_nonblock);

Yep, that's not good either, though I care more about rw.

>
> Regards,
> Xiaoguang Wang
>>
>>>
>>> Fixes: dad1b1242fd5 ("io_uring: always let io_iopoll_complete() complete polled io")
>>> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@xxxxxxxxxxxxxxxxx>
>>> ---
>>>   fs/io_uring.c | 25 +++++++++++++++----------
>>>   1 file changed, 15 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index f53356ced5ab..eab3d2b7d232 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -6354,19 +6354,24 @@ static struct io_wq_work *io_wq_submit_work(struct io_wq_work *work)
>>>  	}
>>>
>>>  	if (ret) {
>>> +		bool iopoll_enabled = req->ctx->flags & IORING_SETUP_IOPOLL;
>>> +
>>>  		/*
>>> -		 * io_iopoll_complete() does not hold completion_lock to complete
>>> -		 * polled io, so here for polled io, just mark it done and still let
>>> -		 * io_iopoll_complete() complete it.
>>> +		 * io_iopoll_complete() does not hold completion_lock to complete polled
>>> +		 * io, so here for polled io we cannot call io_req_complete() directly,
>>> +		 * otherwise there may be concurrent access to cqring, defer_list, etc.,
>>> +		 * which is not safe. Given that io_iopoll_complete() is always called
>>> +		 * under uring_lock, here for polled io we also take uring_lock to
>>> +		 * complete it.
>>>  		 */
>>> -		if (req->ctx->flags & IORING_SETUP_IOPOLL) {
>>> -			struct kiocb *kiocb = &req->rw.kiocb;
>>> +		if (iopoll_enabled)
>>> +			mutex_lock(&req->ctx->uring_lock);
>>>
>>> -			kiocb_done(kiocb, ret, NULL);
>>> -		} else {
>>> -			req_set_fail_links(req);
>>> -			io_req_complete(req, ret);
>>> -		}
>>> +		req_set_fail_links(req);
>>> +		io_req_complete(req, ret);
>>> +
>>> +		if (iopoll_enabled)
>>> +			mutex_unlock(&req->ctx->uring_lock);
>>>  	}
>>>
>>>  	return io_steal_work(req);
>>>
>>

-- 
Pavel Begunkov