VM worker kthreads can linger in the VM process's cgroup for some time
after KVM terminates the VM process.

KVM terminates the worker kthreads by calling kthread_stop(), which waits
on the 'exited' completion, triggered by exit_mm(), via mm_release(), in
do_exit() during the kthread's exit. However, these kthreads are removed
from the cgroup by cgroup_exit(), which happens after exit_mm().
Therefore, a VM process can terminate in between the exit_mm() and
cgroup_exit() calls, leaving only worker kthreads in the cgroup.

Moving worker kthreads back to the original cgroup (kthreadd_task's
cgroup) makes sure that the cgroup is empty as soon as the main VM process
is terminated.

Signed-off-by: Vipin Sharma <vipinsh@xxxxxxxxxx>
---
v3:
- Use 'current->real_parent' (kthreadd_task) in the
  cgroup_attach_task_all() call.
- Revert the cgroup API changes made in v2. The patch no longer touches
  cgroup APIs.
- Update commit and comment message.

v2: https://lore.kernel.org/lkml/20211222225350.1912249-1-vipinsh@xxxxxxxxxx/
- Use kthreadd_task in the cgroup API to avoid build issue.

v1: https://lore.kernel.org/lkml/20211214050708.4040200-1-vipinsh@xxxxxxxxxx/

 virt/kvm/kvm_main.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 83c57bcc6eb6..2c9dcfffb606 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5813,7 +5813,7 @@ static int kvm_vm_worker_thread(void *context)
 	struct kvm *kvm = init_context->kvm;
 	kvm_vm_thread_fn_t thread_fn = init_context->thread_fn;
 	uintptr_t data = init_context->data;
-	int err;
+	int err, reattach_err;
 
 	err = kthread_park(current);
 	/* kthread_park(current) is never supposed to return an error */
@@ -5836,7 +5836,7 @@ static int kvm_vm_worker_thread(void *context)
 	init_context = NULL;
 
 	if (err)
-		return err;
+		goto out;
 
 	/* Wait to be woken up by the spawner before proceeding. */
 	kthread_parkme();
@@ -5844,6 +5844,23 @@ static int kvm_vm_worker_thread(void *context)
 	if (!kthread_should_stop())
 		err = thread_fn(kvm, data);
 
+out:
+	/*
+	 * Move kthread back to its original cgroup to prevent it lingering in
+	 * the cgroup of the VM process, after the latter finishes its
+	 * execution.
+	 *
+	 * kthread_stop() waits on the 'exited' completion condition which is
+	 * set in exit_mm(), via mm_release(), in do_exit(). However, the
+	 * kthread is removed from the cgroup in the cgroup_exit() which is
+	 * called after the exit_mm(). This causes the kthread_stop() to return
+	 * before the kthread actually quits the cgroup.
+	 */
+	reattach_err = cgroup_attach_task_all(current->real_parent, current);
+	if (reattach_err) {
+		kvm_err("%s: cgroup_attach_task_all failed on reattach with err %d\n",
+			__func__, reattach_err);
+	}
 	return err;
 }
 

base-commit: db6e7adf8de9b3b99a9856acb73870cc3a70e3ca
-- 
2.35.1.265.g69c8d7142f-goog
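
For readers who want the ordering problem spelled out, here is a minimal
userspace sketch of the exit sequence described in the commit message. It
is not kernel code: every name in it (model_task, model_do_exit(),
model_worker_exit(), the CG_* values) is hypothetical and exists only for
illustration. It assumes nothing beyond the ordering stated above, namely
that the 'exited' completion fires in exit_mm() before cgroup_exit() runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_cgroup { CG_NONE, CG_KTHREADD, CG_VM };

struct model_task {
	enum model_cgroup cgroup;	/* cgroup the task currently belongs to */
	bool exited_completed;		/* stands in for the 'exited' completion */
};

/* Cgroup a kthread_stop() waiter would see at the moment it is woken. */
static enum model_cgroup observed_at_stop;

/* Models the exit-path ordering: completion first, cgroup removal second. */
static void model_do_exit(struct model_task *t)
{
	assert(!t->exited_completed);	/* a task exits only once */

	/* exit_mm() -> mm_release(): the 'exited' completion fires here. */
	t->exited_completed = true;
	observed_at_stop = t->cgroup;	/* kthread_stop() may return now */

	/* cgroup_exit() runs only afterwards. */
	t->cgroup = CG_NONE;
}

/*
 * Models the worker thread function returning. With reattach set, the
 * task is moved to its parent's cgroup first, standing in for the
 * cgroup_attach_task_all(current->real_parent, current) call.
 */
static enum model_cgroup model_worker_exit(bool reattach)
{
	struct model_task t = { .cgroup = CG_VM, .exited_completed = false };

	if (reattach)
		t.cgroup = CG_KTHREADD;

	model_do_exit(&t);
	return observed_at_stop;
}
```

In this model, model_worker_exit(false) reports CG_VM, meaning the stopper
can return while the worker still sits in the VM cgroup, whereas
model_worker_exit(true) reports CG_KTHREADD. The real race involves
concurrent tasks; the sketch only fixes the ordering so the window is
visible.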