> On Aug 4, 2021, at 1:00 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> Nadav correctly reports that we have a race between a worker exiting,
> and new work being queued. This can lead to work being queued behind
> an existing worker that could be sleeping on an event before it can
> run to completion, and hence introducing potential big latency gaps
> if we hit this race condition:
>
> cpu0                                    cpu1
> ----                                    ----
>                                         io_wqe_worker()
>                                         schedule_timeout()
>                                          // timed out
> io_wqe_enqueue()
> io_wqe_wake_worker()
> // work_flags & IO_WQ_WORK_CONCURRENT
> io_wqe_activate_free_worker()
>                                          io_worker_exit()
>
> Fix this by having the exiting worker go through the normal decrement
> of a running worker, which will spawn a new one if needed.
>
> The free worker activation is modified to only return success if we
> were able to find a sleeping worker - if not, we keep looking through
> the list. If we fail, we create a new worker as per usual.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Link: https://lore.kernel.org/io-uring/BFF746C0-FEDE-4646-A253-3021C57C26C9@xxxxxxxxx/
> Reported-by: Nadav Amit <nadav.amit@xxxxxxxxx>
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

Tested-by: Nadav Amit <nadav.amit@xxxxxxxxx>

Thanks!
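
For anyone following the thread, below is a minimal user-space sketch of the two behaviours the commit message describes: activation succeeds only when a sleeping worker is found, and an exiting worker goes through the normal running-worker decrement, which spawns a replacement if work is pending. This is not the io-wq code. `struct pool`, the `pool_*()` functions, the state enum, and `MAX_WORKERS` are hypothetical names made up for illustration, and the model is single-threaded, so it ignores locking, RCU, and real scheduling.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not the kernel's data structures. */
#define MAX_WORKERS 8

enum worker_state { WORKER_SLEEPING, WORKER_RUNNING, WORKER_EXITING };

struct pool {
    enum worker_state workers[MAX_WORKERS];
    size_t nr_workers;  /* slots in use */
    int nr_running;     /* workers currently counted as running */
    int nr_pending;     /* queued work items not yet picked up */
    int nr_spawned;     /* workers created by this model */
};

/* Create a new worker that starts out running. */
bool pool_spawn_worker(struct pool *p)
{
    if (p->nr_workers >= MAX_WORKERS)
        return false;
    p->workers[p->nr_workers++] = WORKER_RUNNING;
    p->nr_running++;
    p->nr_spawned++;
    return true;
}

/*
 * Report success only if a sleeping worker was found and woken.
 * Workers in any other state (for example one that timed out and is
 * about to exit) are skipped and the scan continues.
 */
bool pool_activate_free_worker(struct pool *p)
{
    for (size_t i = 0; i < p->nr_workers; i++) {
        if (p->workers[i] != WORKER_SLEEPING)
            continue;
        p->workers[i] = WORKER_RUNNING;
        p->nr_running++;
        return true;
    }
    return false;
}

/* Queue one item: wake a sleeper if possible, otherwise create a worker. */
void pool_enqueue(struct pool *p)
{
    p->nr_pending++;
    if (!pool_activate_free_worker(p))
        pool_spawn_worker(p);
}

/*
 * Normal decrement of a running worker: if this was the last running
 * worker and work is still pending, spawn a replacement.
 */
void pool_dec_running(struct pool *p)
{
    if (--p->nr_running == 0 && p->nr_pending > 0)
        pool_spawn_worker(p);
}

/* Exit path: the worker leaves through the same decrement as above. */
void pool_worker_exit(struct pool *p, size_t idx)
{
    p->workers[idx] = WORKER_EXITING;
    pool_dec_running(p);
}
```

In this model, enqueueing while the only worker is already marked `WORKER_EXITING` does not count as a successful activation, so `pool_enqueue()` falls back to `pool_spawn_worker()`. Separately, if work is already pending when the last running worker calls `pool_worker_exit()`, the decrement creates a replacement so the item is not left waiting.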