Hi,

Got a report on io-wq stalls, and it turned into quite the rabbit hole
of fixes. There are two main things fixed by this series:

1) Single ring that has a lot of bounded vs unbounded traffic. The fix
   is mainly just splitting the bounded and unbounded lists, so that we
   never stall bounded unnecessarily. There are further cleanups
   possible on top of this, but that should be deferred to 5.16.

2) Workloads that have io-wq work and rely heavily on signaling to
   communicate between processes/threads. This can interfere with
   worker creation, and this is particularly troublesome if it just
   happens to occur with the first worker creation.

In general, harden the worker creation and ensure we handle failures in
terms of allocations and worker creations.

-- 
Jens Axboe