Re: [PATCH 3/3] KVM:SVM: Enable INVPCID feature on AMD

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 6/12/20 3:10 PM, Jim Mattson wrote:
> On Fri, Jun 12, 2020 at 12:35 PM Babu Moger <babu.moger@xxxxxxx> wrote:
>>
>>
>>
>>> -----Original Message-----
>>> From: Jim Mattson <jmattson@xxxxxxxxxx>
>>> Sent: Thursday, June 11, 2020 6:51 PM
>>> To: Moger, Babu <Babu.Moger@xxxxxxx>
>>> Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx>; Joerg Roedel <joro@xxxxxxxxxx>;
>>> the arch/x86 maintainers <x86@xxxxxxxxxx>; Sean Christopherson
>>> <sean.j.christopherson@xxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>;
>>> Borislav Petkov <bp@xxxxxxxxx>; H . Peter Anvin <hpa@xxxxxxxxx>; Paolo
>>> Bonzini <pbonzini@xxxxxxxxxx>; Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>;
>>> Thomas Gleixner <tglx@xxxxxxxxxxxxx>; LKML <linux-kernel@xxxxxxxxxxxxxxx>;
>>> kvm list <kvm@xxxxxxxxxxxxxxx>
>>> Subject: Re: [PATCH 3/3] KVM:SVM: Enable INVPCID feature on AMD
>>>
>>> On Thu, Jun 11, 2020 at 2:48 PM Babu Moger <babu.moger@xxxxxxx> wrote:
>>>>
>>>> The following intercept is added for INVPCID instruction:
>>>> Code    Name            Cause
>>>> A2h     VMEXIT_INVPCID  INVPCID instruction
>>>>
>>>> The following bit is added to the VMCB layout control area
>>>> to control intercept of INVPCID:
>>>> Byte Offset     Bit(s)    Function
>>>> 14h             2         intercept INVPCID
>>>>
>>>> For the guests with nested page table (NPT) support, the INVPCID
>>>> feature works as running it natively. KVM does not need to do any
>>>> special handling in this case.
>>>>
>>>> Interceptions are required in the following cases.
>>>> 1. If the guest tries to disable the feature when the underlying
>>>> hardware supports it. In this case hypervisor needs to report #UD.
>>>
>>> Per the AMD documentation, attempts to use INVPCID at CPL>0 will
>>> result in a #GP, regardless of the intercept bit. If the guest CPUID
>>> doesn't enumerate the feature, shouldn't the instruction raise #UD
>>> regardless of CPL? This seems to imply that we should intercept #GP
>>> and decode the instruction to see if we should synthesize #UD instead.
>>
>> Purpose here is to report UD when the guest CPUID doesn't enumerate the
>> INVPCID feature When Bare-metal supports it. It seems to work fine for
>> that purpose. You are right. The #GP for CPL>0 takes precedence over
>> interception. No. I am not planning to intercept GP.
> 
> WIthout intercepting #GP, you fail to achieve your stated purpose.

I think I have misunderstood this part. I was not inteding to change the
#GP behaviour. I will remove this part. My intension of these series is to
handle invpcid in shadow page mode. I have verified that part. Hope I did
not miss anything else.

> 
>> I will change the text. How about this?
>>
>> Interceptions are required in the following cases.
>> 1. If the guest CPUID doesn't enumerate the INVPCID feature when the
>> underlying hardware supports it,  hypervisor needs to report UD. However,
>> #GP for CPL>0 takes precedence over interception.
> 
> This text is not internally consistent. In one sentence, you say that
> "hypervisor needs to report #UD." In the next sentence, you are
> essentially saying that the hypervisor doesn't need to report #UD.
> Which is it?
> 
>>>> 2. When the guest is running with shadow page table enabled, in
>>>> this case the hypervisor needs to handle the tlbflush based on the
>>>> type of invpcid instruction type.
>>>>
>>>> AMD documentation for INVPCID feature is available at "AMD64
>>>> Architecture Programmer’s Manual Volume 2: System Programming,
>>>> Pub. 24593 Rev. 3.34(or later)"
>>>>
>>>> The documentation can be obtained at the links below:
>>>> Link:
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.a
>>> md.com%2Fsystem%2Ffiles%2FTechDocs%2F24593.pdf&amp;data=02%7C01%7
>>> Cbabu.moger%40amd.com%7C36861b25f6d143e3b38e08d80e624472%7C3dd8
>>> 961fe4884e608e11a82d994e183d%7C0%7C0%7C637275163374103811&amp;s
>>> data=E%2Fdb6T%2BdO4nrtUoqhKidF6XyorsWrphj6O4WwNZpmYA%3D&amp;res
>>> erved=0
>>>> Link:
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.
>>> kernel.org%2Fshow_bug.cgi%3Fid%3D206537&amp;data=02%7C01%7Cbabu.m
>>> oger%40amd.com%7C36861b25f6d143e3b38e08d80e624472%7C3dd8961fe488
>>> 4e608e11a82d994e183d%7C0%7C0%7C637275163374103811&amp;sdata=b81
>>> 9W%2FhKS93%2BAp3QvcsR0BwTQpUVUFMbIaNaisgWHRY%3D&amp;reserved=
>>> 0
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>>>> ---
>>>>  arch/x86/include/asm/svm.h      |    4 ++++
>>>>  arch/x86/include/uapi/asm/svm.h |    2 ++
>>>>  arch/x86/kvm/svm/svm.c          |   42
>>> +++++++++++++++++++++++++++++++++++++++
>>>>  3 files changed, 48 insertions(+)
>>>>
>>>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
>>>> index 62649fba8908..6488094f67fa 100644
>>>> --- a/arch/x86/include/asm/svm.h
>>>> +++ b/arch/x86/include/asm/svm.h
>>>> @@ -55,6 +55,10 @@ enum {
>>>>         INTERCEPT_RDPRU,
>>>>  };
>>>>
>>>> +/* Extended Intercept bits */
>>>> +enum {
>>>> +       INTERCEPT_INVPCID = 2,
>>>> +};
>>>>
>>>>  struct __attribute__ ((__packed__)) vmcb_control_area {
>>>>         u32 intercept_cr;
>>>> diff --git a/arch/x86/include/uapi/asm/svm.h
>>> b/arch/x86/include/uapi/asm/svm.h
>>>> index 2e8a30f06c74..522d42dfc28c 100644
>>>> --- a/arch/x86/include/uapi/asm/svm.h
>>>> +++ b/arch/x86/include/uapi/asm/svm.h
>>>> @@ -76,6 +76,7 @@
>>>>  #define SVM_EXIT_MWAIT_COND    0x08c
>>>>  #define SVM_EXIT_XSETBV        0x08d
>>>>  #define SVM_EXIT_RDPRU         0x08e
>>>> +#define SVM_EXIT_INVPCID       0x0a2
>>>>  #define SVM_EXIT_NPF           0x400
>>>>  #define SVM_EXIT_AVIC_INCOMPLETE_IPI           0x401
>>>>  #define SVM_EXIT_AVIC_UNACCELERATED_ACCESS     0x402
>>>> @@ -171,6 +172,7 @@
>>>>         { SVM_EXIT_MONITOR,     "monitor" }, \
>>>>         { SVM_EXIT_MWAIT,       "mwait" }, \
>>>>         { SVM_EXIT_XSETBV,      "xsetbv" }, \
>>>> +       { SVM_EXIT_INVPCID,     "invpcid" }, \
>>>>         { SVM_EXIT_NPF,         "npf" }, \
>>>>         { SVM_EXIT_AVIC_INCOMPLETE_IPI,         "avic_incomplete_ipi" }, \
>>>>         { SVM_EXIT_AVIC_UNACCELERATED_ACCESS,
>>> "avic_unaccelerated_access" }, \
>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>> index 285e5e1ff518..82d974338f68 100644
>>>> --- a/arch/x86/kvm/svm/svm.c
>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>> @@ -813,6 +813,11 @@ static __init void svm_set_cpu_caps(void)
>>>>         if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) ||
>>>>             boot_cpu_has(X86_FEATURE_AMD_SSBD))
>>>>                 kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
>>>> +
>>>> +       /* Enable INVPCID if both PCID and INVPCID enabled */
>>>> +       if (boot_cpu_has(X86_FEATURE_PCID) &&
>>>> +           boot_cpu_has(X86_FEATURE_INVPCID))
>>>> +               kvm_cpu_cap_set(X86_FEATURE_INVPCID);
>>>>  }
>>>>
>>>>  static __init int svm_hardware_setup(void)
>>>> @@ -1099,6 +1104,17 @@ static void init_vmcb(struct vcpu_svm *svm)
>>>>                 clr_intercept(svm, INTERCEPT_PAUSE);
>>>>         }
>>>>
>>>> +       /*
>>>> +        * Intercept INVPCID instruction only if shadow page table is
>>>> +        * enabled. Interception is not required with nested page table.
>>>> +        */
>>>> +       if (boot_cpu_has(X86_FEATURE_INVPCID)) {
>>>> +               if (!npt_enabled)
>>>> +                       set_extended_intercept(svm, INTERCEPT_INVPCID);
>>>> +               else
>>>> +                       clr_extended_intercept(svm, INTERCEPT_INVPCID);
>>>> +       }
>>>> +
>>>>         if (kvm_vcpu_apicv_active(&svm->vcpu))
>>>>                 avic_init_vmcb(svm);
>>>>
>>>> @@ -2715,6 +2731,23 @@ static int mwait_interception(struct vcpu_svm
>>> *svm)
>>>>         return nop_interception(svm);
>>>>  }
>>>>
>>>> +static int invpcid_interception(struct vcpu_svm *svm)
>>>> +{
>>>> +       struct kvm_vcpu *vcpu = &svm->vcpu;
>>>> +       unsigned long type;
>>>> +       gva_t gva;
>>>> +
>>>> +       /*
>>>> +        * For an INVPCID intercept:
>>>> +        * EXITINFO1 provides the linear address of the memory operand.
>>>> +        * EXITINFO2 provides the contents of the register operand.
>>>> +        */
>>>> +       type = svm->vmcb->control.exit_info_2;
>>>> +       gva = svm->vmcb->control.exit_info_1;
>>>> +
>>>> +       return kvm_handle_invpcid_types(vcpu,  gva, type);
>>>> +}
>>>> +
>>>>  static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
>>>>         [SVM_EXIT_READ_CR0]                     = cr_interception,
>>>>         [SVM_EXIT_READ_CR3]                     = cr_interception,
>>>> @@ -2777,6 +2810,7 @@ static int (*const svm_exit_handlers[])(struct
>>> vcpu_svm *svm) = {
>>>>         [SVM_EXIT_MWAIT]                        = mwait_interception,
>>>>         [SVM_EXIT_XSETBV]                       = xsetbv_interception,
>>>>         [SVM_EXIT_RDPRU]                        = rdpru_interception,
>>>> +       [SVM_EXIT_INVPCID]                      = invpcid_interception,
>>>>         [SVM_EXIT_NPF]                          = npf_interception,
>>>>         [SVM_EXIT_RSM]                          = rsm_interception,
>>>>         [SVM_EXIT_AVIC_INCOMPLETE_IPI]          =
>>> avic_incomplete_ipi_interception,
>>>> @@ -3562,6 +3596,14 @@ static void svm_cpuid_update(struct kvm_vcpu
>>> *vcpu)
>>>>         svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
>>>>                              guest_cpuid_has(&svm->vcpu, X86_FEATURE_NRIPS);
>>>>
>>>> +       /*
>>>> +        * Intercept INVPCID instruction if the baremetal has the support
>>>> +        * but the guest doesn't claim the feature.
>>>> +        */
>>>> +       if (boot_cpu_has(X86_FEATURE_INVPCID) &&
>>>> +           !guest_cpuid_has(vcpu, X86_FEATURE_INVPCID))
>>>> +               set_extended_intercept(svm, INTERCEPT_INVPCID);
>>>> +
>>>
>>> What if INVPCID is enabled in the guest CPUID later? Shouldn't we then
>>> clear this intercept bit?
>>
>> I assume the feature enable comes in the same code path as this. I can add
>> "if else" check here if that is what you are suggesting.
> 
> Yes, that's what I'm suggesting.
> 
>>>
>>>>         if (!kvm_vcpu_apicv_active(vcpu))
>>>>                 return;
>>>>
>>>>



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux