Re: [PATCH 1/2] x86: KVM: Limit guest physical bits when 5-level EPT is unsupported

On Thu, Jan 4, 2024 at 3:59 AM Tao Su <tao1.su@xxxxxxxxxxxxxxx> wrote:
>
> On Wed, Jan 03, 2024 at 08:34:16PM -0800, Jim Mattson wrote:
> > On Wed, Jan 3, 2024 at 7:40 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Jan 3, 2024 at 6:45 PM Chao Gao <chao.gao@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, Jan 03, 2024 at 10:04:41AM -0800, Sean Christopherson wrote:
> > > > >On Tue, Jan 02, 2024, Jim Mattson wrote:
> > > > >> On Tue, Jan 2, 2024 at 3:24 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > > >> >
> > > > >> > On Thu, Dec 21, 2023, Xu Yilun wrote:
> > > > >> > > On Wed, Dec 20, 2023 at 08:28:06AM -0800, Sean Christopherson wrote:
> > > > >> > > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > > >> > > > > index c57e181bba21..72634d6b61b2 100644
> > > > >> > > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > >> > > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > >> > > > > @@ -5177,6 +5177,13 @@ void __kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
> > > > >> > > > >   reset_guest_paging_metadata(vcpu, mmu);
> > > > >> > > > >  }
> > > > >> > > > >
> > > > >> > > > > +/* guest-physical-address bits limited by TDP */
> > > > >> > > > > +unsigned int kvm_mmu_tdp_maxphyaddr(void)
> > > > >> > > > > +{
> > > > >> > > > > + return max_tdp_level == 5 ? 57 : 48;
> > > > >> > > >
> > > > >> > > > Using "57" is kinda sorta wrong, e.g. the SDM says:
> > > > >> > > >
> > > > >> > > >   Bits 56:52 of each guest-physical address are necessarily zero because
> > > > >> > > >   guest-physical addresses are architecturally limited to 52 bits.
> > > > >> > > >
> > > > >> > > > Rather than split hairs over something that doesn't matter, I think it makes sense
> > > > >> > > > for the CPUID code to consume max_tdp_level directly (I forgot that max_tdp_level
> > > > >> > > > is still accurate when tdp_root_level is non-zero).
> > > > >> > >
> > > > >> > > It is still accurate for now. Only AMD SVM sets tdp_root_level the same as
> > > > >> > > max_tdp_level:
> > > > >> > >
> > > > >> > >       kvm_configure_mmu(npt_enabled, get_npt_level(),
> > > > >> > >                         get_npt_level(), PG_LEVEL_1G);
> > > > >> > >
> > > > >> > > But I want to double-check whether using max_tdp_level directly has
> > > > >> > > been fully considered.  In your last proposal, it is:
> > > > >> > >
> > > > >> > >   u8 kvm_mmu_get_max_tdp_level(void)
> > > > >> > >   {
> > > > >> > >       return tdp_root_level ? tdp_root_level : max_tdp_level;
> > > > >> > >   }
> > > > >> > >
> > > > >> > > and I think it makes more sense, because EPT setup follows the same
> > > > >> > > rule.  If any future architecture sets tdp_root_level smaller than
> > > > >> > > max_tdp_level, the issue will happen again.
> > > > >> >
> > > > >> > Setting tdp_root_level != max_tdp_level would be a blatant bug.  max_tdp_level
> > > > >> > really means "max possible TDP level KVM can use".  If an exact TDP level is being
> > > > >> > forced by tdp_root_level, then by definition it's also the max TDP level, because
> > > > >> > it's the _only_ TDP level KVM supports.
> > > > >>
> > > > >> This is all just so broken and wrong. The only guest.MAXPHYADDR that
> > > > >> can be supported under TDP is the host.MAXPHYADDR. If KVM claims to
> > > > >> support a smaller guest.MAXPHYADDR, then KVM is obligated to intercept
> > > > >> every #PF,
> > > >
> > > > in this case (i.e., to support 48-bit guest.MAXPHYADDR when CPU supports only
> > > > 4-level EPT), KVM has no need to intercept #PF because accessing a GPA with
> > > > RSVD bits 51-48 set leads to EPT violation.
> > >
> > > At the completion of the page table walk, if there is a permission
> > > fault, the data address should not be accessed, so there should not be
> > > an EPT violation. Remember Meltdown?
> > >
> > > > >> and to emulate the faulting instruction to see if the RSVD
> > > > >> bit should be set in the error code. Hardware isn't going to do it.
> > > >
> > > > Note that for EPT violation VM exits, the CPU stores the GPA that caused the exit
> > > > in the "guest-physical address" field of the VMCS, so it is not necessary to emulate
> > > > the faulting instruction to determine whether any RSVD bit is set.
> > >
> > > There should not be an EPT violation in the case discussed.
> >
> > For intercepted #PF, we can use CR2 to determine the necessary page
> > walk, and presumably the rest of the bits in the error code are
> > already set, so emulation is not necessary.
> >
> > However, emulation is necessary when synthesizing a #PF from an EPT
> > violation, and bit 8 of the exit qualification is clear. See
> > https://lore.kernel.org/kvm/4463f391-0a25-017e-f913-69c297e13c5e@xxxxxxxxxx/.
>
> Although not all memory-accessing instructions are emulated, it covers the most common
> cases and is in any case better than KVM hanging. We can probably continue to
> improve allow_smaller_maxphyaddr, but KVM should report the maximum physical width
> it supports.

KVM can only support the host MAXPHYADDR. If EPT on the CPU doesn't
support host MAXPHYADDR, it should be disabled. Shadow paging can
handle host MAXPHYADDR just fine.
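
To make that condition concrete, here is a minimal sketch of the arithmetic
involved. It is not KVM code: both helper names are hypothetical, and it
assumes the usual layout of a 12-bit page offset plus 9 index bits per
paging level, capped at the architectural 52-bit limit quoted from the SDM
earlier in the thread.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a KVM function): number of guest-physical
 * address bits an N-level TDP walk can translate, assuming a 12-bit
 * page offset and 9 index bits per level, capped at the architectural
 * 52-bit physical address limit.
 */
static unsigned int tdp_gpa_bits(unsigned int tdp_levels)
{
	unsigned int bits = 12 + 9 * tdp_levels;

	return bits > 52 ? 52 : bits;
}

/*
 * Hypothetical policy check: TDP is only usable if it can map every
 * address below host MAXPHYADDR; otherwise fall back to shadow paging.
 */
static bool tdp_covers_host_maxphyaddr(unsigned int tdp_levels,
				       unsigned int host_maxphyaddr)
{
	return tdp_gpa_bits(tdp_levels) >= host_maxphyaddr;
}
```

Under these assumptions, a 4-level walk covers 48 bits, so a host with a
52-bit MAXPHYADDR and only 4-level EPT would fail the check, which is the
configuration this patch is about.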

KVM simply does not work when guest MAXPHYADDR < host MAXPHYADDR.
Without additional hardware support, no hypervisor can. I asked Intel
to add hardware support for such configurations about 15 years ago. I
have yet to see it.

> Thanks,
> Tao
>




