On 10/23/23 9:17 AM, Gabriel Krisman Bertazi wrote: > Jens Axboe <axboe@xxxxxxxxx> writes: > >> We could race with SQ thread exit, and if we do, we'll hit a NULL pointer >> dereference. Park the SQPOLL thread while getting the task cpu and pid for >> fdinfo, this ensures we have a stable view of it. >> >> Cc: stable@xxxxxxxxxxxxxxx >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218032 >> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> >> >> --- >> >> diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c >> index c53678875416..cd2a0c6b97c4 100644 >> --- a/io_uring/fdinfo.c >> +++ b/io_uring/fdinfo.c >> @@ -53,7 +53,6 @@ static __cold int io_uring_show_cred(struct seq_file *m, unsigned int id, >> __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f) >> { >> struct io_ring_ctx *ctx = f->private_data; >> - struct io_sq_data *sq = NULL; >> struct io_overflow_cqe *ocqe; >> struct io_rings *r = ctx->rings; >> unsigned int sq_mask = ctx->sq_entries - 1, cq_mask = ctx->cq_entries - 1; >> @@ -64,6 +63,7 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f) >> unsigned int cq_shift = 0; >> unsigned int sq_shift = 0; >> unsigned int sq_entries, cq_entries; >> + int sq_pid = -1, sq_cpu = -1; >> bool has_lock; >> unsigned int i; >> >> @@ -143,13 +143,18 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f) >> has_lock = mutex_trylock(&ctx->uring_lock); >> >> if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) { >> - sq = ctx->sq_data; >> - if (!sq->thread) >> - sq = NULL; >> + struct io_sq_data *sq = ctx->sq_data; >> + >> + io_sq_thread_park(sq); >> + if (sq->thread) { >> + sq_pid = task_pid_nr(sq->thread); >> + sq_cpu = task_cpu(sq->thread); >> + } >> + io_sq_thread_unpark(sq); > > Jens, > > io_sq_thread_park will try to wake the sqpoll, which is, at least, > unnecessary. But I'm thinking we don't want to expose the ability to > schedule the sqpoll from procfs, which can be done by any unrelated > process. > > To solve the bug, it should be enough to synchronize directly on > sqd->lock, preventing sq->thread from going away inside the if leg. > Granted, it is might take longer if the sqpoll is busy, but reading > fdinfo is not supposed to be fast. Alternatively, don't call > wake_process in this case? I did think about that but just went with the exported API. But you are right, it's a bit annoying that it'd also wake the thread, in case it was idle. Probably mostly cosmetic, but we may as well just stick with grabbing the sqd mutex. I'll send a v2. -- Jens Axboe