Re: [PATCH 1/2] x86: KVM: Limit guest physical bits when 5-level EPT is unsupported

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 05, 2024 at 12:26:08PM -0800, Sean Christopherson wrote:
> On Thu, Jan 04, 2024, Jim Mattson wrote:
> > On Thu, Jan 4, 2024 at 7:08 AM Chao Gao <chao.gao@xxxxxxxxx> wrote:
> > >
> > > On Wed, Jan 03, 2024 at 07:40:02PM -0800, Jim Mattson wrote:
> > > >On Wed, Jan 3, 2024 at 6:45 PM Chao Gao <chao.gao@xxxxxxxxx> wrote:
> > > >>
> > > >> On Wed, Jan 03, 2024 at 10:04:41AM -0800, Sean Christopherson wrote:
> > > >> >On Tue, Jan 02, 2024, Jim Mattson wrote:
> > > >> >> This is all just so broken and wrong. The only guest.MAXPHYADDR that
> > > >> >> can be supported under TDP is the host.MAXPHYADDR. If KVM claims to
> > > >> >> support a smaller guest.MAXPHYADDR, then KVM is obligated to intercept
> > > >> >> every #PF,
> > > >>
> > > >> in this case (i.e., to support 48-bit guest.MAXPHYADDR when CPU supports only
> > > >> 4-level EPT), KVM has no need to intercept #PF because accessing a GPA with
> > > >> RSVD bits 51-48 set leads to EPT violation.
> > > >
> > > >At the completion of the page table walk, if there is a permission
> > > >fault, the data address should not be accessed, so there should not be
> > > >an EPT violation. Remember Meltdown?
> > >
> > > You are right. I missed this case. KVM needs to intercept #PF to set RSVD bit
> > > in PFEC.
> > 
> > I have no problem with a user deliberately choosing an unsupported
> > configuration, but I do have a problem with KVM_GET_SUPPORTED_CPUID
> > returning an unsupported configuration.
> 
> +1
> 
> Advertising guest.MAXPHYADDR < host.MAXPHYADDR in KVM_GET_SUPPORTED_CPUID simply
> isn't viable when TDP is enabled.  I suppose KVM do so when allow_smaller_maxphyaddr
> is enabled, but that's just asking for confusion, e.g. if userspace reflects the
> CPUID back into the guest, it could unknowingly create a VM that depends on
> allow_smaller_maxphyaddr.
> 
> I think the least awful option is to have the kernel expose whether or not the
> CPU support 5-level EPT to userspace.  That doesn't even require new uAPI per se,
> just a new flag in /proc/cpuinfo.  It'll be a bit gross for userspace to parse,
> but it's not the end of the world.  Alternatively, KVM could add a capability to
> enumerate the max *addressable* GPA, but userspace would still need to manually
> take action when KVM can't address all of memory, i.e. a capability would be less
> ugly, but wouldn't meaningfully change userspace's responsibilities.

Yes, exposing whether the CPU support 5-level EPT is indeed a better solution, it
not only bypasses the KVM_GET_SUPPORTED_CPUID but also provides the information to
userspace.

I think a new KVM capability to enumerate the max GPA is better since it will be
more convenient if userspace wants to use, e.g., automatically limit physical bits
or the GPA in the user memory region.

But only reporting this capability can’t solve the KVM hang issue, userspace can
choose to ignore the max GPA, e.g., six selftests in changelog are still failed. I
think we can reconsider patch2 if we don’t advertise
guest.MAXPHYADDR < host.MAXPHYADDR in KVM_GET_SUPPORTED_CPUID.

Thanks,
Tao

> 
> I.e.
> 
> diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
> index c6a7eed03914..266daf5b5b84 100644
> --- a/arch/x86/include/asm/vmxfeatures.h
> +++ b/arch/x86/include/asm/vmxfeatures.h
> @@ -25,6 +25,7 @@
>  #define VMX_FEATURE_EPT_EXECUTE_ONLY   ( 0*32+ 17) /* "ept_x_only" EPT entries can be execute only */
>  #define VMX_FEATURE_EPT_AD             ( 0*32+ 18) /* EPT Accessed/Dirty bits */
>  #define VMX_FEATURE_EPT_1GB            ( 0*32+ 19) /* 1GB EPT pages */
> +#define VMX_FEATURE_EPT_5LEVEL         ( 0*32+ 20) /* 5-level EPT paging */
>  
>  /* Aggregated APIC features 24-27 */
>  #define VMX_FEATURE_FLEXPRIORITY       ( 0*32+ 24) /* TPR shadow + virt APIC */
> diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
> index 03851240c3e3..1640ae76548f 100644
> --- a/arch/x86/kernel/cpu/feat_ctl.c
> +++ b/arch/x86/kernel/cpu/feat_ctl.c
> @@ -72,6 +72,8 @@ static void init_vmx_capabilities(struct cpuinfo_x86 *c)
>                 c->vmx_capability[MISC_FEATURES] |= VMX_F(EPT_AD);
>         if (ept & VMX_EPT_1GB_PAGE_BIT)
>                 c->vmx_capability[MISC_FEATURES] |= VMX_F(EPT_1GB);
> +       if (ept & VMX_EPT_PAGE_WALK_5_BIT)
> +               c->vmx_capability[MISC_FEATURES] |= VMX_F(EPT_5LEVEL);
>  
>         /* Synthetic APIC features that are aggregates of multiple features. */
>         if ((c->vmx_capability[PRIMARY_CTLS] & VMX_F(VIRTUAL_TPR)) &&
> 




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux