On Tue, Oct 5, 2021 at 9:22 AM Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>
> On Tue, Oct 05, 2021 at 01:25:52PM +0200, Paolo Bonzini wrote:
> > On 05/10/21 12:58, Marcelo Tosatti wrote:
> > > > There are other effects of cgroups (e.g. memory accounting) than just the
> > > > cpumask;
> > >
> > > Is kvm-nx-hpage using significant amounts of memory?
> >
> > No, that was just an example (and not a good one indeed, because
> > kvm-nx-hpage is not using a substantial amount of either memory or CPU).
> > But for example vhost also uses cgroup_attach_task_all, so it should have
> > the same issue with SCHED_FIFO?
>
> Yes. Would need to fix vhost as well.
>
> >
> > > > Why doesn't the scheduler move the task to a CPU that is not being hogged by
> > > > vCPU SCHED_FIFO tasks?
> > > Because cpuset placement is enforced:
> >
> > Yes, but I would expect the parent cgroup to include both isolated CPUs (for
> > the vCPU threads) and non-isolated housekeeping vCPUs (for the QEMU I/O
> > thread).
>
> Yes, the parent, but why would that matter? If you are in a child
> cpuset, you are restricted to the child cpuset mask (and not the
> parents).

Yes, and at the time of cpuset_attach, the task is attached to any one of
the CPUs that are in the effective cpumask.

>
> > The QEMU I/O thread is not hogging the CPU 100% of the time, and
> > therefore the nx-recovery thread should be able to run on that CPU.
>
> Yes, but:
>
> 1) The cpumask of the parent thread is not inherited
>
> set_cpus_allowed_ptr(task, housekeeping_cpumask(HK_FLAG_KTHREAD));
>
> On __kthread_create_on_node should fail (because its cgroup, the one
> inherited from QEMU, contains only isolated CPUs).
>

Just to confirm, do you mean fail only for unbounded kthreads?

--
Thanks
Nitesh