Jens Axboe <axboe@xxxxxxxxx> writes:

> A previous commit added a trylock for getting the SQPOLL thread info via
> fdinfo, but this introduced a regression where we often fail to get it if
> the thread is busy. For that case, we end up not printing the current CPU
> and PID info.
>
> Rather than rely on this lock, just print the pid we already stored in
> the io_sq_data struct, and ensure we update the current CPU every time we
> are going to sleep. The latter potentially won't be 100% accurate, but
> that wasn't the case before either, as the task can get migrated at any
> time unless it has been pinned at creation time.
>
> We retain keeping the io_sq_data dereference inside the ctx->uring_lock,
> as it has always been, as destruction of the thread and data happen below
> that. We could make this RCU safe, but there's little point in doing that.
>
> With this, we always print the last valid information we had, rather than
> have spurious outputs with missing information.
>
> Fixes: 7644b1a1c9a7 ("io_uring/fdinfo: lock SQ thread while retrieving thread cpu/pid")
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>
> ---
>
> v2: actually remember to use the cached values... also update ->sq_cpu
>     when we initially set it up, if it's not pinned to a given CPU.
>
> diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
> index f04a43044d91..976e9500f651 100644
> --- a/io_uring/fdinfo.c
> +++ b/io_uring/fdinfo.c
> @@ -145,13 +145,8 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
>  	if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
>  		struct io_sq_data *sq = ctx->sq_data;
>
> -		if (mutex_trylock(&sq->lock)) {
> -			if (sq->thread) {
> -				sq_pid = task_pid_nr(sq->thread);
> -				sq_cpu = task_cpu(sq->thread);
> -			}
> -			mutex_unlock(&sq->lock);
> -		}
> +		sq_pid = sq->task_pid;
> +		sq_cpu = sq->sq_cpu;
>  	}
>
>  	seq_printf(m, "SqThread:\t%d\n", sq_pid);
> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
> index bd6c2c7959a5..ecb00322a4e5 100644
> --- a/io_uring/sqpoll.c
> +++ b/io_uring/sqpoll.c
> @@ -229,10 +229,12 @@ static int io_sq_thread(void *data)
>  	snprintf(buf, sizeof(buf), "iou-sqp-%d", sqd->task_pid);
>  	set_task_comm(current, buf);
>
> -	if (sqd->sq_cpu != -1)
> +	if (sqd->sq_cpu != -1) {
>  		set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
> -	else
> +	} else {
>  		set_cpus_allowed_ptr(current, cpu_online_mask);
> +		sqd->sq_cpu = task_cpu(current);
> +	}
>
>  	mutex_lock(&sqd->lock);
>  	while (1) {
> @@ -291,6 +293,7 @@ static int io_sq_thread(void *data)
>  		}
>
>  		if (needs_sched) {
> +			sqd->sq_cpu = task_cpu(current);

Don't you also need to update sqd->sq_cpu in io_sqd_handle_event before
releasing the lock? sqpoll might get migrated after the following
schedule and then parked, in which case sqd->sq_cpu will not be up to
date for a while.

Other than that, I think the patch is fine.

>  			mutex_unlock(&sqd->lock);
>  			schedule();
>  			mutex_lock(&sqd->lock);

-- 
Gabriel Krisman Bertazi
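
For illustration, the additional update suggested above might look like
the following sketch (hypothetical and untested; the context lines only
approximate io_sqd_handle_event in io_uring/sqpoll.c and would need
rebasing onto the posted patch):

	--- a/io_uring/sqpoll.c
	+++ b/io_uring/sqpoll.c
	@@ static bool io_sqd_handle_event(struct io_sq_data *sqd)
	 	if (test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state) ||
	 	    signal_pending(current)) {
	+		/* Record the current CPU before dropping the lock,
	+		 * since the thread may be migrated while parked and
	+		 * fdinfo reads sqd->sq_cpu locklessly.
	+		 */
	+		sqd->sq_cpu = task_cpu(current);
	 		mutex_unlock(&sqd->lock);

This mirrors the sqd->sq_cpu update the patch already adds before the
schedule() in the needs_sched path, applied at the other point where the
thread can go to sleep with sqd->lock released.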