Most of the justification for this work is done in the commit messages
themselves, but the tl;dr is that there are various bits of
functionality that io_uring needs that workqueues don't (and can't)
provide. Hence this adds a small replacement thread pool implementation
that caters to the needs of io_uring, both current and future ones.

Hopefully the sched core changes are palatable. io-wq uses the same
sched in/out hooks as workqueue, and if the task is neither a workqueue
nor an io-wq worker, there should be no extra overhead there.

Patches are on top of my for-5.5/io_uring branch, and can also be found
here:

http://git.kernel.dk/cgit/linux-block/log/?h=for-5.5/io_uring-wq

This passes io_uring IO testing and repeated runs of the liburing
regression suite, and passes tests that would previously deadlock.

 fs/Kconfig            |   3 +
 fs/Makefile           |   1 +
 fs/io-wq.c            | 790 ++++++++++++++++++++++++++++++++++++++++++
 fs/io-wq.h            |  55 +++
 fs/io_uring.c         | 402 +++++----------------
 include/linux/sched.h |   1 +
 init/Kconfig          |   1 +
 kernel/sched/core.c   |  16 +-
 8 files changed, 948 insertions(+), 321 deletions(-)

-- 
Jens Axboe