On Tue, Jan 14, 2025, Paolo Bonzini wrote:
> On 1/13/25 16:35, Keith Busch wrote:
> > > Ok, I found the code and it doesn't exec (e.g.
> > > https://github.com/google/crosvm/blob/b339d3d7/src/crosvm/sys/linux/jail_warden.rs#L122),
> > > so that's not an option. Well, if I understand correctly from a
> > > cursory look at the code, crosvm is creating a jailed child process
> > > early, and then spawns further jails through it; so it's just this
> > > first process that has to cheat.
> > >
> > > One possibility on the KVM side is to delay creating the vhost_task
> > > until the first KVM_RUN. I don't like it but...
> >
> > This option is actually kind of appealing in that we don't need to
> > change any application side to filter out kernel tasks, as well as not
> > having a new kernel dependency to even report these types of tasks as
> > kernel threads.
> >
> > I gave it a quick try. I'm not very familiar with the code here, so not
> > sure if this is thread safe or not,

It's not.

> > but it did successfully get crosvm booting again.
>
> That looks good to me too. Would you like to send it with a commit message
> and SoB?

> > ---
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 2401606db2604..422b6b06de4fe 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -7415,6 +7415,8 @@ int kvm_mmu_post_init_vm(struct kvm *kvm)
> >  {
> >  	if (nx_hugepage_mitigation_hard_disabled)
> >  		return 0;
> > +	if (kvm->arch.nx_huge_page_recovery_thread)
> > +		return 0;

...

> >  	kvm->arch.nx_huge_page_last = get_jiffies_64();
> >  	kvm->arch.nx_huge_page_recovery_thread = vhost_task_create(
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index c79a8cc57ba42..263363c46626b 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -11463,6 +11463,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >  	struct kvm_run *kvm_run = vcpu->run;
> >  	int r;
> > +	r = kvm_mmu_post_init_vm(vcpu->kvm);
> > +	if (r)
> > +		return r;

The only lock held at this point is vcpu->mutex, the obvious choices for
guarding the per-VM task creation are kvm->lock or kvm->mmu_lock, but we
definitely don't want to blindly take either lock in KVM_RUN.
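
To make the concern concrete: with only vcpu->mutex held, two vCPUs of the
same VM can both see a NULL thread pointer and both create the task. Below is
a rough userspace sketch of one common way to do once-only setup without
taking a lock on every call (check a flag first, take a lock only if setup is
still pending). This is an illustration only, not the proposed KVM fix and
not kernel code; struct vm_state, create_recovery_task() and
vm_post_init_once() are hypothetical names, and pthreads plus C11 atomics
stand in for whatever primitive the kernel side would actually use.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-VM state; not the real struct kvm. */
struct vm_state {
	pthread_mutex_t init_lock;	/* taken only while setup is pending */
	atomic_bool initialized;	/* set once setup has succeeded */
	int create_calls;		/* how many times setup actually ran */
};

/* Hypothetical stand-in for the one-time per-VM task creation. */
static int create_recovery_task(struct vm_state *vm)
{
	vm->create_calls++;
	return 0;
}

/* Safe to call from every "run" entry; setup happens at most once on success. */
static int vm_post_init_once(struct vm_state *vm)
{
	int r = 0;

	/* Fast path: setup already done, no lock taken. */
	if (atomic_load_explicit(&vm->initialized, memory_order_acquire))
		return 0;

	/* Slow path: serialize callers that race on the first run. */
	pthread_mutex_lock(&vm->init_lock);
	if (!atomic_load_explicit(&vm->initialized, memory_order_relaxed)) {
		r = create_recovery_task(vm);
		/* Publish success only; on failure a later call retries. */
		if (!r)
			atomic_store_explicit(&vm->initialized, true,
					      memory_order_release);
	}
	pthread_mutex_unlock(&vm->init_lock);

	return r;
}
```

The steady-state cost in this sketch is a single acquire load, which is the
property wanted for a hot path like KVM_RUN; the dedicated init_lock also
avoids piling more work onto an existing, heavily used lock.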