On 9/8/21 12:21 PM, Hao Xu wrote:
> On 2021/9/7 2:48 PM, Hao Xu wrote:
>> On 2021/9/7 3:04 AM, Pavel Begunkov wrote:
>>> On 9/3/21 12:00 PM, Hao Xu wrote:
>>>> For operations like accept, multishot is a useful feature, since we
>>>> can reduce the number of accept sqes. Let's integrate it with fast
>>>> poll; it may be good for other operations in the future.
>>>>
>>>> Signed-off-by: Hao Xu <haoxu@xxxxxxxxxxxxxxxxx>
>>>> ---
>>>>   fs/io_uring.c | 15 ++++++++++++---
>>>>   1 file changed, 12 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>> index d6df60c4cdb9..dae7044e0c24 100644
>>>> --- a/fs/io_uring.c
>>>> +++ b/fs/io_uring.c
>>>> @@ -5277,8 +5277,15 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
>>>>           return;
>>>>       }
>>>> -    hash_del(&req->hash_node);
>>>> -    io_poll_remove_double(req);
>>>> +    if (READ_ONCE(apoll->poll.canceled))
>>>> +        apoll->poll.events |= EPOLLONESHOT;
>>>> +    if (apoll->poll.events & EPOLLONESHOT) {
>>>> +        hash_del(&req->hash_node);
>>>> +        io_poll_remove_double(req);
>>>> +    } else {
>>>> +        add_wait_queue(apoll->poll.head, &apoll->poll.wait);
>>>
>>> It looks like it does both io_req_task_submit() and adding back
>>> to the wq, so io_issue_sqe() may be called in parallel with
>>> io_async_task_func(). If so, there will be tons of all kinds of
>>> races.
>> IMHO, io_async_task_func() is called in the original context one by
>> one (except when PF_EXITING is set, in which case it is called in the
>> system-wq), so there shouldn't be a parallel case there.
> ping...

fwiw, the case we're talking about:

CPU0                                 |  CPU1
io_async_task_func()                 |
 -> add_wait_queue();                |
 -> io_req_task_submit();            |
/* no tw run happened in between */  |
                                     |  io_async_task_func()
                                     |  --> io_req_task_submit()

We called io_req_task_submit() twice without running tw in between.
Both calls use the same req->io_task_work.node field in the request
for accounting, so the second call will corrupt tctx->task_list, and
not only by failing to consider that req->io_task_work.node is
already taken/enqueued.

io_req_task_work_add()
{
	wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
}

>>>
>>>> +    }
>>>> +
>>>>       spin_unlock(&ctx->completion_lock);
>>>>       if (!READ_ONCE(apoll->poll.canceled))
>>>> @@ -5366,7 +5373,7 @@ static int io_arm_poll_handler(struct io_kiocb *req)
>>>>       struct io_ring_ctx *ctx = req->ctx;
>>>>       struct async_poll *apoll;
>>>>       struct io_poll_table ipt;
>>>> -    __poll_t ret, mask = EPOLLONESHOT | POLLERR | POLLPRI;
>>>> +    __poll_t ret, mask = POLLERR | POLLPRI;
>>>>       int rw;
>>>>       if (!req->file || !file_can_poll(req->file))
>>>> @@ -5388,6 +5395,8 @@ static int io_arm_poll_handler(struct io_kiocb *req)
>>>>           rw = WRITE;
>>>>           mask |= POLLOUT | POLLWRNORM;
>>>>       }
>>>> +    if (!(req->flags & REQ_F_APOLL_MULTISHOT))
>>>> +        mask |= EPOLLONESHOT;
>>>>       /* if we can't nonblock try, then no point in arming a poll handler */
>>>>       if (!io_file_supports_nowait(req, rw))
>>>>
>>>
>

--
Pavel Begunkov
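
[Editor's note: to make the list corruption Pavel describes concrete,
here is a minimal userspace sketch. The structs and the function below
are simplified stand-ins mirroring the shape of io_uring's
wq_list_add_tail()/io_wq_work_list, not the actual kernel code; names
like wq_node/wq_list are invented for the example.]

#include <stdio.h>

struct wq_node {
	struct wq_node *next;
};

struct wq_list {
	struct wq_node *first;
	struct wq_node *last;
};

static void wq_list_add_tail(struct wq_node *node, struct wq_list *list)
{
	node->next = NULL;	/* clobbers any link the node already holds */
	if (!list->first) {
		list->first = node;
		list->last = node;
	} else {
		list->last->next = node;
		list->last = node;
	}
}

int main(void)
{
	struct wq_list task_list = { NULL, NULL };
	struct wq_node req_a = { NULL }, req_b = { NULL };

	wq_list_add_tail(&req_a, &task_list);	/* 1st io_req_task_submit() */
	wq_list_add_tail(&req_b, &task_list);	/* another queued request */
	/*
	 * 2nd enqueue of the same request before tw ran: req_a->next
	 * (which pointed at req_b) is overwritten with NULL and req_a is
	 * linked behind req_b, so req_b becomes unreachable from
	 * list->first and the list ends early.
	 */
	wq_list_add_tail(&req_a, &task_list);

	/* Walking the list prints only "req_a"; req_b has been lost. */
	for (struct wq_node *n = task_list.first; n; n = n->next)
		printf("%s\n", n == &req_a ? "req_a" : "req_b");
	return 0;
}

In the real code the second enqueue can additionally race with
task_work_run() consuming the node, so the outcome is even less
predictable than in this single-threaded sketch.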