On 9/8/21 1:03 PM, Pavel Begunkov wrote:
> On 9/8/21 12:21 PM, Hao Xu wrote:
>> On 2021/9/7 2:48 PM, Hao Xu wrote:
>>> On 2021/9/7 3:04 AM, Pavel Begunkov wrote:
>>>> On 9/3/21 12:00 PM, Hao Xu wrote:
>>>>> For operations like accept, multishot is a useful feature, since we can
>>>>> reduce the number of accept sqes. Let's integrate it into fast poll; it
>>>>> may be useful for other operations in the future.
>>>>>
>>>>> Signed-off-by: Hao Xu <haoxu@xxxxxxxxxxxxxxxxx>
>>>>> ---
>>>>>  fs/io_uring.c | 15 ++++++++++++---
>>>>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>>> index d6df60c4cdb9..dae7044e0c24 100644
>>>>> --- a/fs/io_uring.c
>>>>> +++ b/fs/io_uring.c
>>>>> @@ -5277,8 +5277,15 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
>>>>>  		return;
>>>>>  	}
>>>>> -	hash_del(&req->hash_node);
>>>>> -	io_poll_remove_double(req);
>>>>> +	if (READ_ONCE(apoll->poll.canceled))
>>>>> +		apoll->poll.events |= EPOLLONESHOT;
>>>>> +	if (apoll->poll.events & EPOLLONESHOT) {
>>>>> +		hash_del(&req->hash_node);
>>>>> +		io_poll_remove_double(req);
>>>>> +	} else {
>>>>> +		add_wait_queue(apoll->poll.head, &apoll->poll.wait);
>>>>
>>>> It looks like it does both io_req_task_submit() and adding back
>>>> to the wq, so io_issue_sqe() may be called in parallel with
>>>> io_async_task_func(). If so, there will be tons of all kinds of
>>>> races.
>>> IMHO, io_async_task_func() is called in the original context one by
>>> one (except when PF_EXITING is set, in which case it is called in the
>>> system wq), so there shouldn't be a parallel case there.
>> ping...
> 
> fwiw, the case we're talking about:
> 
> CPU0                          | CPU1
> io_async_task_func()          |
>  -> add_wait_queue();         |
>  -> io_req_task_submit();     |
>       /* no tw run happened in between */
>                               | io_async_task_func()
>                               |  --> io_req_task_submit()
> 
> We called io_req_task_submit() twice without running tw in between.
> Both calls use the same req->io_task_work.node field in the request
> for accounting, so the second call will screw up tctx->task_list, and
> not only by failing to consider that req->io_task_work.node is already
> taken/enqueued.
> 
> io_req_task_work_add() {
>     wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
> }

fwiw, it can be just one CPU, simply:

io_req_task_work_add();
io_req_task_work_add();
task_work_run();

// the first one is buggy under the current constraints.

> 
>>>>
>>>>> +	}
>>>>> +
>>>>>  	spin_unlock(&ctx->completion_lock);
>>>>>  	if (!READ_ONCE(apoll->poll.canceled))
>>>>> @@ -5366,7 +5373,7 @@ static int io_arm_poll_handler(struct io_kiocb *req)
>>>>>  	struct io_ring_ctx *ctx = req->ctx;
>>>>>  	struct async_poll *apoll;
>>>>>  	struct io_poll_table ipt;
>>>>> -	__poll_t ret, mask = EPOLLONESHOT | POLLERR | POLLPRI;
>>>>> +	__poll_t ret, mask = POLLERR | POLLPRI;
>>>>>  	int rw;
>>>>>  	if (!req->file || !file_can_poll(req->file))
>>>>> @@ -5388,6 +5395,8 @@ static int io_arm_poll_handler(struct io_kiocb *req)
>>>>>  		rw = WRITE;
>>>>>  		mask |= POLLOUT | POLLWRNORM;
>>>>>  	}
>>>>> +	if (!(req->flags & REQ_F_APOLL_MULTISHOT))
>>>>> +		mask |= EPOLLONESHOT;
>>>>>  	/* if we can't nonblock try, then no point in arming a poll handler */
>>>>>  	if (!io_file_supports_nowait(req, rw))
>>>>>
>>>>
>>
> 

-- 
Pavel Begunkov
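
[Editor's note: for reference, a minimal stand-alone C sketch of the hazard
described above: enqueueing the same intrusive node onto a singly-linked
tail list twice before the list is drained, as io_req_task_work_add() would
do if called twice for one request without a task_work_run() in between.
The names my_node, my_list and list_add_tail are made up for illustration;
io_uring's equivalents are io_wq_work_node, io_wq_work_list and
wq_list_add_tail.]

/*
 * Sketch only, not io_uring code: double-enqueue of one intrusive node
 * corrupts a singly-linked tail list.
 */
#include <stdio.h>

struct my_node {
	struct my_node *next;
};

struct my_list {
	struct my_node *first;
	struct my_node *last;
};

static void list_add_tail(struct my_node *node, struct my_list *list)
{
	node->next = NULL;
	if (!list->first) {
		list->first = node;
		list->last = node;
	} else {
		list->last->next = node;
		list->last = node;
	}
}

int main(void)
{
	struct my_list task_list = { NULL, NULL };
	struct my_node req_node = { NULL };

	/* two "io_req_task_work_add()" calls, no "task_work_run()" between */
	list_add_tail(&req_node, &task_list);
	list_add_tail(&req_node, &task_list);

	/*
	 * The second add sets list->last->next = node while list->last is
	 * already that same node, so req_node now points at itself: walking
	 * task_list via ->next never terminates.
	 */
	printf("req_node.next == &req_node? %s\n",
	       req_node.next == &req_node ? "yes (list corrupted)" : "no");
	return 0;
}

[With other requests queued in between, the same pattern can instead drop
entries or form a longer cycle; either way the tw list accounting is broken,
which is the race being pointed out.]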