On Wed, Jan 19, 2022 at 07:02:53PM +0100, Paolo Bonzini wrote:
> On 1/18/22 21:39, Tejun Heo wrote:
> > So, these are normally driven by the !populated events. That's how everyone
> > else is doing it. If you want to tie the kvm workers lifetimes to kvm
> > process, wouldn't it be cleaner to do so from kvm side? ie. let kvm process
> > exit wait for the workers to be cleaned up.
>
> It does. For example kvm_mmu_post_init_vm's call to
> kvm_vm_create_worker_thread is matched with the call to
> kthread_stop in kvm_mmu_pre_destroy_vm.
>
> According to Vpin, the problem is that there's a small amount of time
> between the return from kthread_stop and the point where the cgroup
> can be removed. My understanding of the race is the following:

Okay, this is because kthread_stop piggybacks on vfork_done to wait for the
task exit instead of the usual exit notification, so it only waits till
exit_mm(), which is uhh... weird. So, migrating is one option, I guess,
albeit a rather ugly one. It'd be nicer if we can make kthread_stop()
waiting more regular but I couldn't find a good existing place and routing
the usual parent signaling might be too complicated. Anyone have better
ideas?

Thanks.

--
tejun