On Tue, Mar 08, 2022, Paolo Bonzini wrote:
> On 3/8/22 19:11, Sean Christopherson wrote:
> > On Mon, Feb 21, 2022, Paolo Bonzini wrote:
> > > Extended bits are unnecessary because page walking uses the CPU mode,
> > > and EFER.NX/CR0.WP can be set to one unconditionally---matching the
> > > format of shadow pages rather than the format of guest pages.
> > 
> > But they don't match the format of shadow pages.  EPT has an equivalent to NX in
> > that KVM can always clear X, but KVM explicitly supports running with EPT and
> > EFER.NX=0 in the host (32-bit non-PAE kernels).
> 
> In which case bit 2 of EPTs doesn't change meaning, does it?
> 
> > CR0.WP is equally confusing.  Yes, both EPT and NPT enforce write protection at
> > all times, but EPT has no concept of user vs. supervisor in the EPT tables
> > themselves, at least with respect to writes (thanks, mode-based execution, for
> > the qualifier...).  NPT is even worse, as the APM explicitly states:
> > 
> >   The host hCR0.WP bit is ignored under nested paging.
> > 
> > Unless there's some hidden dependency I'm missing, I'd prefer we arbitrarily
> > leave them zero.
> 
> Setting EFER.NX=0 might be okay for EPT/NPT, but I'd prefer to set it
> respectively to 1 (X bit always present) and host EFER.NX (NX bit present
> depending on host EFER).
> 
> For CR0.WP it should really be 1 in my opinion, because CR0.WP=0 implies
> having a concept of user vs. supervisor access: CR0.WP=1 is the "default",
> while CR0.WP=0 is "always allow *supervisor* writes".

Yeah, I think we generally agree, we just came to different conclusions :-)

I'm totally fine setting them to '1', especially given the patch I just "posted",
but please add comments (suggested NX comment below).  The explicit "WP is
ignored" blurb for hCR0 on NPT will be especially confusing at some point.

With efer_nx forced to '1', we can do this somewhere in this series.  I really,
really despise "context" :-).
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9c79a0927a48..657df7fd74bf 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4461,25 +4461,15 @@ static inline bool boot_cpu_is_amd(void)
 	return shadow_x_mask == 0;
 }
 
-static void
-reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
+static void reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *mmu)
 {
-	/*
-	 * KVM doesn't honor execute-protection from the host page tables, but
-	 * NX is required and potentially used at any time by KVM for NPT, as
-	 * the NX hugepages iTLB multi-hit mitigation is supported for any CPU
-	 * despite no known AMD (and derivative) CPUs being affected by erratum.
-	 */
-	bool efer_nx = true;
-
 	struct rsvd_bits_validate *shadow_zero_check;
 	int i;
 
-	shadow_zero_check = &context->shadow_zero_check;
+	shadow_zero_check = &mmu->shadow_zero_check;
 
 	if (boot_cpu_is_amd())
 		__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
-					context->shadow_root_level, efer_nx,
+					mmu->shadow_root_level, is_efer_nx(mmu),
 					boot_cpu_has(X86_FEATURE_GBPAGES),
 					false, true);
 	else
@@ -4490,7 +4480,7 @@ reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
 	if (!shadow_me_mask)
 		return;
 
-	for (i = context->shadow_root_level; --i >= 0;) {
+	for (i = mmu->shadow_root_level; --i >= 0;) {
 		shadow_zero_check->rsvd_bits_mask[0][i] &= ~shadow_me_mask;
 		shadow_zero_check->rsvd_bits_mask[1][i] &= ~shadow_me_mask;
 	}
@@ -4751,6 +4741,16 @@ kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
 	role.base.access = ACC_ALL;
 	role.base.cr0_wp = true;
+
+	/*
+	 * KVM doesn't honor execute-protection from the host page tables, but
+	 * NX is required and potentially used at any time by KVM for NPT, as
+	 * the NX hugepages iTLB multi-hit mitigation is supported for any CPU
+	 * despite no known AMD (and derivative) CPUs being affected by erratum.
+	 *
+	 * This is functionally accurate for EPT, if technically wrong, as KVM
+	 * can always clear the X bit on EPT.
+	 */
 	role.base.efer_nx = true;
 	role.base.smm = cpu_mode.base.smm;
 	role.base.guest_mode = cpu_mode.base.guest_mode;