On Tue, Feb 27, 2024 at 06:41:36PM -0800, Sean Christopherson wrote: > Add and use a synthetic, KVM-defined page fault error code to indicate > whether a fault is to private vs. shared memory. TDX and SNP have > different mechanisms for reporting private vs. shared, and KVM's > software-protected VMs have no mechanism at all. Usurp an error code > flag to avoid having to plumb another parameter to kvm_mmu_page_fault() > and friends. > > Alternatively, KVM could borrow AMD's PFERR_GUEST_ENC_MASK, i.e. set it > for TDX and software-protected VMs as appropriate, but that would require > *clearing* the flag for SEV and SEV-ES VMs, which support encrypted > memory at the hardware layer, but don't utilize private memory at the > KVM layer. I see this alternative in other patchset. And I still don't understand why synthetic way is better after reading patch #5-7. I assume for SEV(-ES) if there is spurious PFERR_GUEST_ENC flag set in error code when private memory is not used in KVM, then it is a HW issue or SW bug that needs to be caught and warned, and by the way cleared. Thanks, Yilun > > Opportunistically add a comment to call out that the logic for software- > protected VMs is (and was before this commit) broken for nested MMUs, i.e. > for nested TDP, as the GPA is an L2 GPA. Punt on trying to play nice with > nested MMUs as there is a _lot_ of functionality that simply doesn't work > for software-protected VMs, e.g. all of the paths where KVM accesses guest > memory need to be updated to be aware of private vs. shared memory. > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > --- > arch/x86/include/asm/kvm_host.h | 11 +++++++++++ > arch/x86/kvm/mmu/mmu.c | 26 +++++++++++++++++++------- > arch/x86/kvm/mmu/mmu_internal.h | 2 +- > 3 files changed, 31 insertions(+), 8 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 1e69743ef0fb..4077c46c61ab 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -267,7 +267,18 @@ enum x86_intercept_stage; > #define PFERR_GUEST_ENC_MASK BIT_ULL(34) > #define PFERR_GUEST_SIZEM_MASK BIT_ULL(35) > #define PFERR_GUEST_VMPL_MASK BIT_ULL(36) > + > +/* > + * IMPLICIT_ACCESS is a KVM-defined flag used to correctly perform SMAP checks > + * when emulating instructions that triggers implicit access. > + */ > #define PFERR_IMPLICIT_ACCESS BIT_ULL(48) > +/* > + * PRIVATE_ACCESS is a KVM-defined flag us to indicate that a fault occurred > + * when the guest was accessing private memory. > + */ > +#define PFERR_PRIVATE_ACCESS BIT_ULL(49) > +#define PFERR_SYNTHETIC_MASK (PFERR_IMPLICIT_ACCESS | PFERR_PRIVATE_ACCESS) > > #define PFERR_NESTED_GUEST_PAGE (PFERR_GUEST_PAGE_MASK | \ > PFERR_WRITE_MASK | \ > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index 408969ac1291..7807bdcd87e8 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -5839,19 +5839,31 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err > bool direct = vcpu->arch.mmu->root_role.direct; > > /* > - * IMPLICIT_ACCESS is a KVM-defined flag used to correctly perform SMAP > - * checks when emulating instructions that triggers implicit access. > * WARN if hardware generates a fault with an error code that collides > - * with the KVM-defined value. Clear the flag and continue on, i.e. > - * don't terminate the VM, as KVM can't possibly be relying on a flag > - * that KVM doesn't know about. > + * with KVM-defined sythentic flags. Clear the flags and continue on, > + * i.e. don't terminate the VM, as KVM can't possibly be relying on a > + * flag that KVM doesn't know about. > */ > - if (WARN_ON_ONCE(error_code & PFERR_IMPLICIT_ACCESS)) > - error_code &= ~PFERR_IMPLICIT_ACCESS; > + if (WARN_ON_ONCE(error_code & PFERR_SYNTHETIC_MASK)) > + error_code &= ~PFERR_SYNTHETIC_MASK; > > if (WARN_ON_ONCE(!VALID_PAGE(vcpu->arch.mmu->root.hpa))) > return RET_PF_RETRY; > > + /* > + * Except for reserved faults (emulated MMIO is shared-only), set the > + * private flag for software-protected VMs based on the gfn's current > + * attributes, which are the source of truth for such VMs. Note, this > + * wrong for nested MMUs as the GPA is an L2 GPA, but KVM doesn't > + * currently supported nested virtualization (among many other things) > + * for software-protected VMs. > + */ > + if (IS_ENABLED(CONFIG_KVM_SW_PROTECTED_VM) && > + !(error_code & PFERR_RSVD_MASK) && > + vcpu->kvm->arch.vm_type == KVM_X86_SW_PROTECTED_VM && > + kvm_mem_is_private(vcpu->kvm, gpa_to_gfn(cr2_or_gpa))) > + error_code |= PFERR_PRIVATE_ACCESS; > + > r = RET_PF_INVALID; > if (unlikely(error_code & PFERR_RSVD_MASK)) { > r = handle_mmio_page_fault(vcpu, cr2_or_gpa, direct); > diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h > index 1fab1f2359b5..d7c10d338f14 100644 > --- a/arch/x86/kvm/mmu/mmu_internal.h > +++ b/arch/x86/kvm/mmu/mmu_internal.h > @@ -306,7 +306,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, > .max_level = KVM_MAX_HUGEPAGE_LEVEL, > .req_level = PG_LEVEL_4K, > .goal_level = PG_LEVEL_4K, > - .is_private = kvm_mem_is_private(vcpu->kvm, cr2_or_gpa >> PAGE_SHIFT), > + .is_private = err & PFERR_PRIVATE_ACCESS, > }; > int r; > > -- > 2.44.0.278.ge034bb2e1d-goog > >