On Tue, Jul 16, 2024 at 01:17:27PM -0700, Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > On Mon, Jul 15, 2024, isaku.yamahata@xxxxxxxxx wrote: > > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > > > > Add GPA limit check to kvm_arch_vcpu_pre_fault_memory() with guest > > maxphyaddr and kvm_mmu_max_gfn(). > > > > The KVM page fault handler decides which level of TDP to use, 4-level TDP > > or 5-level TDP based on guest maxphyaddr (CPUID[0x80000008].EAX[7:0]), the > > host maxphyaddr, and whether the host supports 5-level TDP or not. The > > 4-level TDP can map GPA up to 48 bits, and the 5-level TDP can map GPA up > > to 52 bits. If guest maxphyaddr <= 48, KVM uses 4-level TDP even when the > > host supports 5-level TDP. > > > > If we pass GPA > beyond the TDP mappable limit to the TDP MMU fault handler > > (concretely GPA > 48-bits with 4-level TDP), it will operate on GPA without > > upper bits, (GPA & ((1UL < 48) - 1)), not the specified GPA. It is not > > expected behavior. It wrongly maps GPA without upper bits with the page > > for GPA with upper bits. > > > > KVM_PRE_FAULT_MEMORY calls x86 KVM page fault handler, kvm_tdp_page_fault() > > with a user-space-supplied GPA without the limit check so that the user > > space can trigger WARN_ON_ONCE(). Check the GPA limit to fix it. > > Which WARN? Sorry, I confused with the local changes for 4/5-level. > > > - For non-TDX case (DEFAULT_VM, SW_PROTECTED_VM, or SEV): > > When the host supports 5-level TDP, KVM decides to use 4-level TDP if > > cpuid_maxphyaddr() <= 48. cpuid_maxhyaddr() check prevents > > KVM_PRE_FAULT_MEMORY from passing GFN beyond mappable GFN. > > Hardening against cpuid_maxphyaddr() should be out of scope. We don't enforce > it for guest faults, e.g. KVM doesn't kill the guest if allow_smaller_maxphyaddr > is false and the GPA is supposed to be illegal. And trying to enforce it here is > a fool's errand since userspace can simply do KVM_SET_CPUID2 to circumvent the > restriction. Ok, I'll drop maxphys addr check. > > - For TDX case: > > We'd like to exclude shared bit (or gfn_direct_mask in [1]) from GPA > > passed to the TDP MMU so that the TDP MMU can handle Secure-EPT or > > Shared-EPT (direct or mirrored in [1]) without explicitly > > setting/clearing the GPA (except setting up the TDP iterator, > > tdp_iter_refresh_sptep()). We'd like to make kvm_mmu_max_gfn() per VM > > for TDX to be 52 or 47 independent of the guest maxphyaddr with other > > patches. > > > > Fixes: 6e01b7601dfe ("KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()") > > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > > --- > > arch/x86/kvm/mmu/mmu.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > > index 4e0e9963066f..6ee5af55cee1 100644 > > --- a/arch/x86/kvm/mmu/mmu.c > > +++ b/arch/x86/kvm/mmu/mmu.c > > @@ -4756,6 +4756,11 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu, > > u64 end; > > int r; > > > > + if (range->gpa >= (1UL << cpuid_maxphyaddr(vcpu))) > > + return -E2BIG; > > + if (gpa_to_gfn(range->gpa) > kvm_mmu_max_gfn()) > > + return -E2BIG; > > > Related to my thoughts on making kvm_mmu_max_gfn() and rejecting aliased memslots, > I think we should add a common helper that's used by kvm_arch_prepare_memory_region() > and kvm_arch_vcpu_pre_fault_memory() to reject GPAs that are disallowed. > > https://lore.kernel.org/all/ZpbKqG_ZhCWxl-Fc@xxxxxxxxxx I'll look into it. I'll combine two patches into single patch series. -- Isaku Yamahata <isaku.yamahata@xxxxxxxxx>