The patch below does not apply to the 6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable@xxxxxxxxxxxxxxx>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y git checkout FETCH_HEAD git cherry-pick -x 45500dc4e01c167ee063f3dcc22f51ced5b2b1e9 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '2023090909-unnatural-giddiness-7f7a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^.. Possible dependencies: 45500dc4e01c ("io_uring: break out of iowq iopoll on teardown") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 45500dc4e01c167ee063f3dcc22f51ced5b2b1e9 Mon Sep 17 00:00:00 2001 From: Pavel Begunkov <asml.silence@xxxxxxxxx> Date: Thu, 7 Sep 2023 13:50:07 +0100 Subject: [PATCH] io_uring: break out of iowq iopoll on teardown io-wq will retry iopoll even when it failed with -EAGAIN. If that races with task exit, which sets TIF_NOTIFY_SIGNAL for all its workers, such workers might potentially infinitely spin retrying iopoll again and again and each time failing on some allocation / waiting / etc. Don't keep spinning if io-wq is dying. Fixes: 561fb04a6a225 ("io_uring: replace workqueue usage with io-wq") Cc: stable@xxxxxxxxxxxxxxx Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c index 62f345587df5..1ecc8c748768 100644 --- a/io_uring/io-wq.c +++ b/io_uring/io-wq.c @@ -174,6 +174,16 @@ static void io_worker_ref_put(struct io_wq *wq) complete(&wq->worker_done); } +bool io_wq_worker_stopped(void) +{ + struct io_worker *worker = current->worker_private; + + if (WARN_ON_ONCE(!io_wq_current_is_worker())) + return true; + + return test_bit(IO_WQ_BIT_EXIT, &worker->wq->state); +} + static void io_worker_cancel_cb(struct io_worker *worker) { struct io_wq_acct *acct = io_wq_get_acct(worker); diff --git a/io_uring/io-wq.h b/io_uring/io-wq.h index 06d9ca90c577..2b2a6406dd8e 100644 --- a/io_uring/io-wq.h +++ b/io_uring/io-wq.h @@ -52,6 +52,7 @@ void io_wq_hash_work(struct io_wq_work *work, void *val); int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask); int io_wq_max_workers(struct io_wq *wq, int *new_count); +bool io_wq_worker_stopped(void); static inline bool io_wq_is_hashed(struct io_wq_work *work) { diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 0f0ba31c3850..58d8dd34a45f 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -1975,6 +1975,8 @@ void io_wq_submit_work(struct io_wq_work *work) if (!needs_poll) { if (!(req->ctx->flags & IORING_SETUP_IOPOLL)) break; + if (io_wq_worker_stopped()) + break; cond_resched(); continue; }