On 10/23/23 16:27, Jens Axboe wrote:
On 10/23/23 9:17 AM, Gabriel Krisman Bertazi wrote:
Jens Axboe <axboe@xxxxxxxxx> writes:
We could race with SQ thread exit, and if we do, we'll hit a NULL pointer
dereference. Park the SQPOLL thread while getting the task cpu and pid for
fdinfo, this ensures we have a stable view of it.
Cc: stable@xxxxxxxxxxxxxxx
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218032
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
index c53678875416..cd2a0c6b97c4 100644
--- a/io_uring/fdinfo.c
+++ b/io_uring/fdinfo.c
@@ -53,7 +53,6 @@ static __cold int io_uring_show_cred(struct seq_file *m, unsigned int id,
__cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
{
struct io_ring_ctx *ctx = f->private_data;
- struct io_sq_data *sq = NULL;
struct io_overflow_cqe *ocqe;
struct io_rings *r = ctx->rings;
unsigned int sq_mask = ctx->sq_entries - 1, cq_mask = ctx->cq_entries - 1;
@@ -64,6 +63,7 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
unsigned int cq_shift = 0;
unsigned int sq_shift = 0;
unsigned int sq_entries, cq_entries;
+ int sq_pid = -1, sq_cpu = -1;
bool has_lock;
unsigned int i;
@@ -143,13 +143,18 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
has_lock = mutex_trylock(&ctx->uring_lock);
if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
- sq = ctx->sq_data;
- if (!sq->thread)
- sq = NULL;
+ struct io_sq_data *sq = ctx->sq_data;
+
+ io_sq_thread_park(sq);
+ if (sq->thread) {
+ sq_pid = task_pid_nr(sq->thread);
+ sq_cpu = task_cpu(sq->thread);
+ }
+ io_sq_thread_unpark(sq);
Jens,
io_sq_thread_park will try to wake the sqpoll, which is, at least,
unnecessary. But I'm thinking we don't want to expose the ability to
schedule the sqpoll from procfs, which can be done by any unrelated
process.
To solve the bug, it should be enough to synchronize directly on
sqd->lock, preventing sq->thread from going away inside the if leg.
Granted, it is might take longer if the sqpoll is busy, but reading
fdinfo is not supposed to be fast. Alternatively, don't call
wake_process in this case?
I did think about that but just went with the exported API. But you are
right, it's a bit annoying that it'd also wake the thread, in case it
Waking it up is not a problem but without parking sq thread won't drop
the lock until it's time to sleep, which might be pretty long leaving
the /proc read stuck on the lock uninterruptibly.
Aside from parking vs lock, there is a lock inversion now:
proc read | SQPOLL
|
try_lock(ring) // success |
| woken up
| lock(sqd); // success
lock(sqd); // stuck |
| try to submit requests
| -- lock(ring); // stuck
was idle. Probably mostly cosmetic, but we may as well just stick with
grabbing the sqd mutex. I'll send a v2.
--
Pavel Begunkov