On 6/7/21 5:26 AM, Jan Kara wrote:
> Commit 545fbd0775ba ("rq-qos: fix missed wake-ups in rq_qos_throttle")
> tried to fix a problem that a process could be sleeping in rq_qos_wait()
> without anyone to wake it up. However the fix is not complete and the
> following can still happen:
>
> CPU1 (waiter1)          CPU2 (waiter2)          CPU3 (waker)
> rq_qos_wait()           rq_qos_wait()
>   acquire_inflight_cb() -> fails
>                           acquire_inflight_cb() -> fails
>
>                                                 completes IOs, inflight
>                                                   decreased
>   prepare_to_wait_exclusive()
>                           prepare_to_wait_exclusive()
>   has_sleeper = !wq_has_single_sleeper() -> true as there are two sleepers
>                           has_sleeper = !wq_has_single_sleeper() -> true
>   io_schedule()           io_schedule()
>
> Deadlock as now there's nobody to wakeup the two waiters. The logic
> automatically blocking when there are already sleepers is really subtle
> and the only way to make it work reliably is that we check whether there
> are some waiters in the queue when adding ourselves there. That way, we
> are guaranteed that at least the first process to enter the wait queue
> will recheck the waiting condition before going to sleep and thus
> guarantee forward progress.

Applied, thanks.

-- 
Jens Axboe
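
For readers following the thread, the interleaving in the quoted description can be replayed in a small single-threaded userspace model. This is only a sketch under stated assumptions: struct model_wq, model_enqueue(), model_has_sleeper_after_enqueue(), model_enqueue_was_empty() and the two replay functions are hypothetical names made up for illustration and are not kernel APIs. The model also leaves out the wait-queue lock, which is what would make "check and add" a single atomic step in real code. It counts how many of the two waiters would recheck the waiting condition (has_sleeper == false) before sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a wait queue: only the length matters. */
struct model_wq {
	size_t nr_waiters;	/* tasks currently queued */
};

/* Models adding ourselves to the queue without reporting anything. */
static void model_enqueue(struct model_wq *wq)
{
	wq->nr_waiters++;
}

/*
 * Models the check described in the race: it runs after enqueueing and
 * asks "am I the single sleeper?". The result is true whenever the queue
 * does not hold exactly one waiter.
 */
static bool model_has_sleeper_after_enqueue(const struct model_wq *wq)
{
	return wq->nr_waiters != 1;
}

/*
 * Models the approach the description asks for: look at the queue while
 * adding ourselves. In real code both steps would sit under one lock.
 */
static bool model_enqueue_was_empty(struct model_wq *wq)
{
	bool was_empty = (wq->nr_waiters == 0);

	wq->nr_waiters++;
	return was_empty;
}

/* Replays the quoted interleaving: both waiters enqueue, then both check. */
static int replay_check_after_enqueue(void)
{
	struct model_wq wq = { 0 };
	int rechecks = 0;

	model_enqueue(&wq);			/* waiter1 */
	model_enqueue(&wq);			/* waiter2 */
	if (!model_has_sleeper_after_enqueue(&wq))	/* waiter1 */
		rechecks++;
	if (!model_has_sleeper_after_enqueue(&wq))	/* waiter2 */
		rechecks++;
	return rechecks;	/* nobody rechecks, so both just sleep */
}

/* Same two waiters, but each learns the queue state while enqueueing. */
static int replay_check_while_enqueueing(void)
{
	struct model_wq wq = { 0 };
	int rechecks = 0;

	if (model_enqueue_was_empty(&wq))	/* waiter1: queue was empty */
		rechecks++;
	if (model_enqueue_was_empty(&wq))	/* waiter2: waiter1 is queued */
		rechecks++;
	return rechecks;	/* the first waiter rechecks the condition */
}
```

In this model replay_check_after_enqueue() finds no waiter that would recheck the condition, which corresponds to the deadlock in the description, while replay_check_while_enqueueing() always leaves the first waiter to recheck, whatever order the two waiters arrive in.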