On Fri, Jun 07, 2024 at 11:39:14PM +0000, "Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> wrote: > On Fri, 2024-06-07 at 11:31 +0200, Paolo Bonzini wrote: > > > -int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, > > > - int *root_level) > > > +static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 > > > *sptes, > > > + enum kvm_tdp_mmu_root_types root_type) > > > { > > > - struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa); > > > + struct kvm_mmu_page *root = tdp_mmu_get_root(vcpu, root_type); > > > > I think this function should take the struct kvm_mmu_page * directly. > > > > > +{ > > > + *root_level = vcpu->arch.mmu->root_role.level; > > > + > > > + return __kvm_tdp_mmu_get_walk(vcpu, addr, sptes, KVM_DIRECT_ROOTS); > > > > Here you pass root_to_sp(vcpu->arch.mmu->root.hpa); > > I see. It is another case of more indirection to try to send the decision making > through the helpers. We can try to open code things more. > > > > > > +int kvm_tdp_mmu_get_walk_mirror_pfn(struct kvm_vcpu *vcpu, u64 gpa, > > > + kvm_pfn_t *pfn) > > > +{ > > > + u64 sptes[PT64_ROOT_MAX_LEVEL + 1], spte; > > > + int leaf; > > > + > > > + lockdep_assert_held(&vcpu->kvm->mmu_lock); > > > + > > > + rcu_read_lock(); > > > + leaf = __kvm_tdp_mmu_get_walk(vcpu, gpa, sptes, KVM_MIRROR_ROOTS); > > > > and likewise here. > > > > You might also consider using a kvm_mmu_root_info for the mirror root, > > even though the pgd field is not used. > > This came up on the last version actually. The reason against it was that it > used that tiny bit of extra memory for the pgd. It does look more symmetrical > though. > > > > > Then __kvm_tdp_mmu_get_walk() can take a struct kvm_mmu_root_info * instead. > > Ahh, I see. Yes, that's a good reason. > > > > > kvm_tdp_mmu_get_walk_mirror_pfn() doesn't belong in this series, but > > introducing __kvm_tdp_mmu_get_walk() can stay here. > > Ok, we can split it. > > > > > Looking at the later patch, which uses > > kvm_tdp_mmu_get_walk_mirror_pfn(), I think this function is a bit > > overkill. I'll do a pre-review of the init_mem_region function, > > especially the usage of kvm_gmem_populate: > > > > + slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); > > + if (!kvm_slot_can_be_private(slot) || !kvm_mem_is_private(kvm, gfn)) { > > + ret = -EFAULT; > > + goto out_put_page; > > + } > > > > The slots_lock is taken, so checking kvm_slot_can_be_private is unnecessary. > > > > Checking kvm_mem_is_private perhaps should also be done in > > kvm_gmem_populate() itself. I'll send a patch. > > > > + read_lock(&kvm->mmu_lock); > > + > > + ret = kvm_tdp_mmu_get_walk_mirror_pfn(vcpu, gpa, &mmu_pfn); > > + if (ret < 0) > > + goto out; > > + if (ret > PG_LEVEL_4K) { > > + ret = -EINVAL; > > + goto out; > > + } > > + if (mmu_pfn != pfn) { > > + ret = -EAGAIN; > > + goto out; > > + } > > > > If you require pre-faulting, you don't need to return mmu_pfn and > > things would be seriously wrong if the two didn't match, wouldn't > > they? > > Yea, I'm not sure why it would be a normal condition. Maybe Isaku can comment on > the thinking? Sean suggested for KVM_TDX_INIT_MEM_REGION to check if the two pfn from TDP MMU and guest_memfd match. As pointed out, the two PFNs should match under appropriate lock (or heavily broken). Personally I'm fine to remove such check and to avoid returning pfn. https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@xxxxxxxxxx/ Then KVM provides a dedicated TDX ioctl(), i.e. what is/was KVM_TDX_INIT_MEM_REGION, to do PAGE.ADD. KVM_TDX_INIT_MEM_REGION wouldn't need to map anything, it would simply need to verify that the pfn from guest_memfd() is the same as what's in the TDP MMU. -- Isaku Yamahata <isaku.yamahata@xxxxxxxxx>