On 9/10/24 13:11, Felix Moessbauer wrote:
> The io worker threads are userland threads that just never exit to the
> userland. By that, they are also assigned to a cgroup (the group of the
> creating task).

The io-wq task is not actually assigned to a cgroup. To belong to a
cgroup, its pid has to be present in the cgroup.procs file of the
corresponding cgroup, which is not the case here. My understanding is
that you are just restricting the CPU affinity to follow the cpuset of
the corresponding user task that creates it. The CPU affinity (cpumask)
is just one of the many resources controlled by a cgroup. That probably
needs to be clarified in the commit message.
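
Just to make the distinction concrete, here is a rough sketch (not a
suggestion for the patch itself) of how "same cpuset-derived cpumask"
differs from "same cgroup" on the default hierarchy;
io_worker_shares_cgroup() is a hypothetical helper, invented purely for
illustration:

#include <linux/cgroup.h>

/*
 * Sketch only: having the same allowed-CPU mask does not make the
 * worker a member of the creator's cgroup.  Membership is a property
 * of the task's css_set, which can be compared roughly like this.
 */
static bool io_worker_shares_cgroup(struct task_struct *worker,
                                    struct task_struct *creator)
{
        struct cgroup *wcg, *ccg;
        bool same;

        rcu_read_lock();
        wcg = task_dfl_cgroup(worker);
        ccg = task_dfl_cgroup(creator);
        same = (wcg == ccg);
        rcu_read_unlock();

        return same;
}
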
Besides the cpumask, the cpuset controller also controls the nodemask of
the allowed memory nodes.
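
For completeness, here is a minimal sketch of what also mirroring the
memory side of the cpuset could look like, assuming a hypothetical
wq->node_mask field (io-wq has no such field today, so this is
illustration only, not a request for this patch):

#include <linux/cpuset.h>
#include <linux/nodemask.h>

/*
 * Sketch: copy both cpuset-controlled resources at io_wq_create() time.
 * wq->node_mask is hypothetical and exists only for this illustration.
 */
static void io_wq_copy_cpuset(struct io_wq *wq, struct task_struct *task)
{
        /* allowed CPUs, as the patch already does */
        cpuset_cpus_allowed(task, wq->cpu_mask);

        /* allowed memory nodes, which the patch does not touch */
        wq->node_mask = cpuset_mems_allowed(task);
}
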
Cheers,
Longman

> When creating a new io worker, this worker should inherit the cpuset
> of the cgroup.
>
> Fixes: da64d6db3bd3 ("io_uring: One wqe per wq")
> Signed-off-by: Felix Moessbauer <felix.moessbauer@xxxxxxxxxxx>
> ---
>  io_uring/io-wq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
> index c7055a8895d7..a38f36b68060 100644
> --- a/io_uring/io-wq.c
> +++ b/io_uring/io-wq.c
> @@ -1168,7 +1168,7 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
>  	if (!alloc_cpumask_var(&wq->cpu_mask, GFP_KERNEL))
>  		goto err;
> -	cpumask_copy(wq->cpu_mask, cpu_possible_mask);
> +	cpuset_cpus_allowed(data->task, wq->cpu_mask);
>  	wq->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
>  	wq->acct[IO_WQ_ACCT_UNBOUND].max_workers =
>  				task_rlimit(current, RLIMIT_NPROC);