On 3/10/23 20:38, Jens Axboe wrote:
> On 3/10/23 1:11 PM, Breno Leitao wrote:
>> Right now io_wq allocates one io_wqe per NUMA node. As io_wq is now
>> bound to a task, the task basically uses only the NUMA local io_wqe, and
>> almost never changes NUMA nodes, thus, the other wqes are mostly
>> unused.
>
> What if the task gets migrated to a different node? Unless the task
> is pinned to a node/cpumask that is local to that node, it will move
> around freely.
In which case we're screwed anyway, and not only for the slow io-wq
path but also for the hot path, as rings and all io_uring ctx and
requests won't be migrated to the new node.
It's also curious whether io-wq workers will get migrated
automatically as they are a part of the thread group.
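For reference, "pinned to a cpumask" from userspace could look roughly
like the sketch below. It is only an illustration: the helper names are
made up, it pins to a single CPU rather than a whole node (a node-wide
mask would set every CPU of that node), and it says nothing about what
io-wq workers end up doing.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: lowest CPU the calling thread may run on, or -1. */
int first_allowed_cpu(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/* Hypothetical helper: restrict the calling thread to one CPU; 0 on success. */
int pin_self_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	/* pid 0 means "the calling thread" */
	return sched_setaffinity(0, sizeof(set), &set);
}

/* Hypothetical helper: number of CPUs in the caller's affinity mask, or -1. */
int allowed_cpu_count(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	return CPU_COUNT(&set);
}
```

A task that calls pin_self_to_cpu(first_allowed_cpu()) early on stays on
that CPU, and therefore on that CPU's node, until its mask is changed
again.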
> I'm not a huge fan of the per-node setup, but I think the reasonings
> given in this patch are a bit too vague and we need to go a bit
> deeper on what a better setup would look like.
--
Pavel Begunkov