On 1/29/25 10:36 PM, Max Kellermann wrote:
> On Thu, Jan 30, 2025 at 12:41 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote:
>> Ok, then it's an architectural problem and needs more serious
>> reengineering, e.g. of how work items are stored and grabbed
>
> Rough unpolished idea: I was thinking about having multiple work
> lists, each with its own spinlock (separate cache line), and each
> io-wq thread only uses one of them, while the submitter round-robins
> through the lists.

Pending work would certainly need better spreading than just the two
classes we have now.

One thing to keep in mind is that the design of io-wq is such that it's
quite possible to have N work items pending and just a single thread
serving all of them. If the io-wq thread doesn't go to sleep, it will
keep processing work units. This is done for efficiency reasons, and to
avoid a proliferation of io-wq threads when it's not going to be
beneficial. This means that when you queue a work item, it's not easy to
pick an appropriate io-wq thread upfront, and generally the io-wq thread
itself will pick its next work item at the perfect time - when it
doesn't have anything else to do, or has finished the existing work.

This should be kept in mind for making io-wq scale better.

-- 
Jens Axboe
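
For illustration only, here is a rough userspace model of the
"multiple work lists, round-robin submitter" idea quoted above. It is
not io-wq code: ShardedWorkQueue, submit(), grab() and worker_loop() are
made-up names, Python locks stand in for per-list spinlocks, and a
single submitter is assumed. The run at the end shows one way to
picture the caveat: if only the worker for list 0 is active, the items
that were round-robined onto list 1 just sit there.

```python
import itertools
import threading
from collections import deque


class ShardedWorkQueue:
    """Hypothetical model: several work lists, each with its own lock."""

    def __init__(self, nr_lists):
        self.lists = [deque() for _ in range(nr_lists)]
        self.locks = [threading.Lock() for _ in range(nr_lists)]
        # Round-robin cursor; assumes a single submitter for simplicity.
        self._next = itertools.cycle(range(nr_lists))

    def submit(self, work):
        """Queue work on the next list in round-robin order."""
        idx = next(self._next)
        with self.locks[idx]:
            self.lists[idx].append(work)
        return idx

    def grab(self, idx):
        """Pop the oldest item from list idx, or None if it is empty."""
        with self.locks[idx]:
            return self.lists[idx].popleft() if self.lists[idx] else None


def worker_loop(wq, idx, done):
    """Keep processing list idx until it is empty (the worker never sleeps
    while work is available, and only ever looks at its own list)."""
    while (work := wq.grab(idx)) is not None:
        done.append((idx, work()))


wq = ShardedWorkQueue(nr_lists=2)
for n in range(6):
    wq.submit(lambda n=n: n * n)

# Only the worker bound to list 0 runs; list 1 has no active worker.
done = []
worker_loop(wq, 0, done)
stranded = len(wq.lists[1])
```

In this toy run the lone worker drains its own list and stops, while
half of the submitted items remain queued on the other list. A real
design would need some answer for that case (stealing from other lists,
or choosing the list based on which workers are awake), and the model
says nothing about which of those would fit io-wq.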