On 11/15/23 6:42 AM, Jens Axboe wrote:
>> 2. Sometimes it can output, sometimes it outputs -1.
>>
>> The test results are as follows:
>>
>> Every 0.5s: cat /proc/9572/fdinfo/6 | grep Sq
>> SqMask:         0x3
>> SqHead:         6765744
>> SqTail:         6765744
>> CachedSqHead:   6765744
>> SqThread:       -1
>> SqThreadCpu:    -1
>> SqBusy:         0%
>> -------------------------------------------
>> Every 0.5s: cat /proc/9572/fdinfo/6 | grep Sq
>> SqMask:         0x3
>> SqHead:         7348727
>> SqTail:         7348728
>> CachedSqHead:   7348728
>> SqThread:       9571
>> SqThreadCpu:    174
>> SqBusy:         95%
>
> Right, this is due to the uring_lock. We got rid of the main
> regression, which was the new trylock for the sqd->lock, but the old
> one remains. We can fix this as well for sqpoll info, but it's not a
> regression from past releases, it's always been like that.
>
> Pavel and I discussed it yesterday, and the easy solution is to make
> io_sq_data be under RCU protection. But that requires this patch
> first, so we don't have to fiddle with the sqpoll task itself. I can
> try and hack up the patch if you want to test it, it'd be on top of
> this one and for the next kernel release rather than 6.7.

Something like this, totally untested.
diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
index 976e9500f651..434a21a6b653 100644
--- a/io_uring/fdinfo.c
+++ b/io_uring/fdinfo.c
@@ -142,11 +142,16 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
 	 */
 	has_lock = mutex_trylock(&ctx->uring_lock);
 
-	if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
-		struct io_sq_data *sq = ctx->sq_data;
-
-		sq_pid = sq->task_pid;
-		sq_cpu = sq->sq_cpu;
+	if (ctx->flags & IORING_SETUP_SQPOLL) {
+		struct io_sq_data *sq;
+
+		rcu_read_lock();
+		sq = READ_ONCE(ctx->sq_data);
+		if (sq) {
+			sq_pid = sq->task_pid;
+			sq_cpu = sq->sq_cpu;
+		}
+		rcu_read_unlock();
 	}
 
 	seq_printf(m, "SqThread:\t%d\n", sq_pid);
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
index 65b5dbe3c850..583c76945cdf 100644
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -70,7 +70,7 @@ void io_put_sq_data(struct io_sq_data *sqd)
 		WARN_ON_ONCE(atomic_read(&sqd->park_pending));
 
 		io_sq_thread_stop(sqd);
-		kfree(sqd);
+		kfree_rcu(sqd, rcu);
 	}
 }
@@ -313,7 +313,7 @@ static int io_sq_thread(void *data)
 	}
 
 	io_uring_cancel_generic(true, sqd);
-	sqd->thread = NULL;
+	WRITE_ONCE(sqd->thread, NULL);
 	list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 		atomic_or(IORING_SQ_NEED_WAKEUP, &ctx->rings->sq_flags);
 	io_run_task_work();
@@ -411,7 +411,7 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
 		goto err_sqpoll;
 	}
 
-	sqd->thread = tsk;
+	WRITE_ONCE(sqd->thread, tsk);
 	ret = io_uring_alloc_task_context(tsk, ctx);
 	wake_up_new_task(tsk);
 	if (ret)
diff --git a/io_uring/sqpoll.h b/io_uring/sqpoll.h
index 8df37e8c9149..0cf0c5833a27 100644
--- a/io_uring/sqpoll.h
+++ b/io_uring/sqpoll.h
@@ -18,6 +18,8 @@ struct io_sq_data {
 	unsigned long		state;
 	struct completion	exited;
+
+	struct rcu_head		rcu;
 };
 
 int io_sq_offload_create(struct io_ring_ctx *ctx, struct io_uring_params *p);

-- 
Jens Axboe