Re: [PATCH 2/3] io-wq: fix no lock protection of acct->nr_worker

Hao Xu <haoxu@xxxxxxxxxxxxxxxxx> · Sat, 7 Aug 2021 17:56:23 +0800

On 2021/8/6 10:27 PM, Jens Axboe wrote:
On Thu, Aug 5, 2021 at 4:05 AM Hao Xu <haoxu@xxxxxxxxxxxxxxxxx> wrote:

acct->nr_worker is accessed without lock protection. Consider this
case: two callers call io_wqe_wake_worker() in parallel on two CPUs,
one from the original context and the other from an io-worker (by
calling io_wqe_enqueue(wqe, linked)). This may cause nr_worker to
become larger than max_worker.
Let's fix it by taking the lock around it, and let's do nr_workers++
before create_io_worker(). There may be an edge case where the first
caller fails to create an io-worker, but the second caller doesn't know
that and so gives up creating an io-worker as well:

say nr_worker = max_worker - 1
         cpu 0                        cpu 1
    io_wqe_wake_worker()          io_wqe_wake_worker()
       nr_worker < max_worker
       nr_worker++
       create_io_worker()         nr_worker == max_worker
          failed                  return
       return

But the chance of this case is very slim.

Fixes: 685fe7feedb9 ("io-wq: eliminate the need for a manager thread")
Signed-off-by: Hao Xu <haoxu@xxxxxxxxxxxxxxxxx>
---
  fs/io-wq.c | 17 ++++++++++++-----
  1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index cd4fd4d6268f..88d0ba7be1fb 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -247,9 +247,14 @@ static void io_wqe_wake_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
         ret = io_wqe_activate_free_worker(wqe);
         rcu_read_unlock();

-       if (!ret && acct->nr_workers < acct->max_workers) {
-               atomic_inc(&acct->nr_running);
-               atomic_inc(&wqe->wq->worker_refs);
+       if (!ret) {
+               raw_spin_lock_irq(&wqe->lock);
+               if (acct->nr_workers < acct->max_workers) {
+                       atomic_inc(&acct->nr_running);
+                       atomic_inc(&wqe->wq->worker_refs);
+                       acct->nr_workers++;
+               }
+               raw_spin_unlock_irq(&wqe->lock);
                 create_io_worker(wqe->wq, wqe, acct->index);
         }
  }

There's a pretty grave bug in this patch, in that you now call
create_io_worker() unconditionally. This causes obvious problems with
misaccounting, and stalls that hit the idle timeout...

This is surely a silly mistake. I'll check this patch and 3/3 again.
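
For reference, one possible shape of the check-then-create logic that avoids the unconditional create_io_worker() call is sketched below. This is a userspace illustration only, not the kernel code and not necessarily what a v2 would look like: struct acct, wake_worker(), create_fn, create_ok() and create_fail() are hypothetical names, a pthread mutex stands in for wqe->lock, and giving the slot back when creation fails is an assumption added here to cover the edge case from the commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct io_wqe_acct. */
struct acct {
	pthread_mutex_t lock;      /* plays the role of wqe->lock */
	unsigned int nr_workers;
	unsigned int max_workers;
};

/* Hypothetical worker-creation callback; returns true on success. */
typedef bool (*create_fn)(void);

/*
 * Reserve a worker slot under the lock, and only call create() if the
 * reservation succeeded. Returns true if a worker was created.
 */
static bool wake_worker(struct acct *acct, create_fn create)
{
	bool do_create = false;

	pthread_mutex_lock(&acct->lock);
	if (acct->nr_workers < acct->max_workers) {
		acct->nr_workers++;
		do_create = true;
	}
	pthread_mutex_unlock(&acct->lock);

	/* At the limit: do not create a worker at all. */
	if (!do_create)
		return false;

	if (create())
		return true;

	/* Assumption: on failure, release the slot we reserved. */
	pthread_mutex_lock(&acct->lock);
	assert(acct->nr_workers > 0);
	acct->nr_workers--;
	pthread_mutex_unlock(&acct->lock);
	return false;
}

/* Trivial callbacks for exercising both paths. */
static bool create_ok(void)   { return true; }
static bool create_fail(void) { return false; }
```

With max_workers set to 1, a first wake_worker(&a, create_ok) call takes the only slot and a second call returns false without invoking the callback. A call with create_fail leaves nr_workers where it started, so a later caller is not blocked by the failed attempt.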