Hi,

> -----Original Message-----
> From: Pavel Begunkov <asml.silence@xxxxxxxxx>
> Sent: Monday, January 13, 2025 5:16 AM
> To: Bui Quang Minh <minhquangbui99@xxxxxxxxx>; lizetao
> <lizetao1@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> Cc: Jens Axboe <axboe@xxxxxxxxx>; io-uring@xxxxxxxxxxxxxxx;
> syzbot+3c750be01dab672c513d@xxxxxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH] io_uring: simplify the SQPOLL thread check when
> cancelling requests
>
> On 1/12/25 16:14, Bui Quang Minh wrote:
> ...
> >>> @@ -2898,7 +2899,12 @@ static __cold void io_ring_exit_work(struct
> >>> work_struct *work)
> >>>  	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
> >>>  		io_move_task_work_from_local(ctx);
> >>>
> >>> -	while (io_uring_try_cancel_requests(ctx, NULL, true))
> >>> +	/*
> >>> +	 * Even if SQPOLL thread reaches this path, don't force
> >>> +	 * iopoll here, let the io_uring_cancel_generic handle
> >>> +	 * it.
> >>
> >> Just curious, will sq_thread enter this io_ring_exit_work path?
> >
> > AFAIK, yes. The SQPOLL thread is created with create_io_thread, this
> > function creates a new task with CLONE_FILES. So all the open files is
> > shared. There will be case that the parent closes its io_uring file and
> > SQPOLL thread become the only owner of that file. So it can reach this
> > path when terminating.
>
> The function is run by a separate kthread, the sqpoll task doesn't call
> it directly.

I think so too; the sqpoll task would not call io_ring_exit_work() itself.

>
> [...]
> >>>> io_uring,
> >>> -						     cancel_all);
> >>> +						     cancel_all,
> >>> +						     true);
> >>>  			}
> >>>
> >>>  			if (loop) {
> >>> --
> >>> 2.43.0
> >>>
> >>
> >> Maybe you miss something, just like Begunkov mentioned in your last
> >> version patch:
> >>
> >> 	io_uring_cancel_generic
> >> 		WARN_ON_ONCE(sqd && sqd->thread != current);
> >>
> >> This WARN_ON_ONCE will never be triggered, so you could remove it.
> >
> > He meant that we don't need to annotate sqd->thread access in this debug
> > check.
> > The io_uring_cancel_generic function has assumption that the sqd is not
> > NULL only when it's called by a SQPOLL thread. So the check means to
> > ensure this assumption. A data race happens only when this function is
> > called by other tasks than the SQPOLL thread, so it can race with the
> > SQPOLL termination. However, the sqd is not NULL only when this function
> > is called by SQPOLL thread. In normal situation following the
> > io_uring_cancel_generic's assumption, the data race cannot happen. And in
> > case the assumption is broken, the warning almost always is triggered
> > even if data race happens. So we can ignore the race here.
>
> Right. And that's the point of warnings, they're supposed to be
> untriggerable, otherwise there is a problem with the code that needs to
> be fixed.

Okay, I understand the purpose of this WARN now.

>
> --
> Pavel Begunkov

---
Li Zetao
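
To check my understanding of the point above, here is a rough userspace
sketch of a warn-once check guarding the "sqd is non-NULL only when the
caller is the SQPOLL thread" assumption. It is not kernel code: the helper
warn_on_once(), the struct sqd_sketch type, the integer task ids and
cancel_generic_check() are all names I made up for illustration, and it
only models the debug check, not the cancellation itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Counts how many warnings were actually printed (illustration only). */
static int warnings_printed;

/*
 * Hypothetical userspace stand-in for WARN_ON_ONCE(): it reports only
 * the first violation seen through a given flag, and always returns
 * the condition so the caller can still react to it.
 */
static bool warn_on_once(bool cond, bool *warned, const char *expr)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

/* Assumption: an int id stands in for the kernel's task pointer. */
struct sqd_sketch {
	int thread;
};

/*
 * Models only the debug check: a non-NULL sqd is expected solely when
 * the caller is the SQPOLL thread itself, so a mismatch means the
 * assumption has been broken somewhere else in the code.
 */
static bool cancel_generic_check(const struct sqd_sketch *sqd, int current_id)
{
	static bool warned;

	return warn_on_once(sqd && sqd->thread != current_id, &warned,
			    "sqd && sqd->thread != current");
}
```

With this model, calls that respect the assumption (sqd == NULL, or the
caller id equal to sqd->thread) never print anything, and a violating
caller is reported once no matter how many times it repeats.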