On 3/11/23 1:56 PM, Pavel Begunkov wrote:
> On 3/10/23 20:38, Jens Axboe wrote:
>> On 3/10/23 1:11 PM, Breno Leitao wrote:
>>> Right now io_wq allocates one io_wqe per NUMA node. As io_wq is now
>>> bound to a task, the task basically uses only the NUMA-local io_wqe,
>>> and almost never changes NUMA nodes, thus the other wqes are mostly
>>> unused.
>>
>> What if the task gets migrated to a different node? Unless the task
>> is pinned to a node/cpumask that is local to that node, it will move
>> around freely.
>
> In which case we're screwed anyway, and not only for the slow io-wq
> path but also for the hot path, as the rings and all the io_uring ctx
> and requests won't be migrated locally.

Oh, agreed, I'm not saying it's ideal, but it can happen. What if you
deliberately use io-wq to offload work and you set it to another mask?
That one I suppose we could handle by allocating based on the set
mask. Two nodes might be more difficult... For most things this won't
really matter, as io-wq is a slow path there, but there might very
well be cases that deliberately offload.

> It's also curious whether io-wq workers will get migrated
> automatically, as they are part of the thread group.

They certainly will, unless affinitized otherwise.

-- 
Jens Axboe
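
For reference, the "set it to another mask" case discussed above is what
an application would typically do from userspace with liburing's
io_uring_register_iowq_aff(). Below is a minimal sketch of that usage;
the queue depth and CPU numbers are illustrative and not taken from the
patch under review:

/* Restrict io-wq workers for this ring to a chosen CPU set. */
#define _GNU_SOURCE
#include <liburing.h>
#include <sched.h>
#include <stdio.h>

int main(void)
{
	struct io_uring ring;
	cpu_set_t mask;
	int ret;

	ret = io_uring_queue_init(8, &ring, 0);
	if (ret < 0) {
		fprintf(stderr, "queue_init: %d\n", ret);
		return 1;
	}

	/* Pin io-wq workers to CPUs 2-3; the submitting task itself may
	 * be running (and stay) on CPUs belonging to a different node. */
	CPU_ZERO(&mask);
	CPU_SET(2, &mask);
	CPU_SET(3, &mask);

	ret = io_uring_register_iowq_aff(&ring, sizeof(mask), &mask);
	if (ret < 0)
		fprintf(stderr, "register_iowq_aff: %d\n", ret);

	io_uring_queue_exit(&ring);
	return 0;
}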