On 12/10/20 3:36 PM, Jim Mattson wrote:
> On Thu, Dec 10, 2020 at 1:26 PM Babu Moger <babu.moger@xxxxxxx> wrote:
>>
>> Hi Jim,
>>
>>> -----Original Message-----
>>> From: Jim Mattson <jmattson@xxxxxxxxxx>
>>> Sent: Monday, December 7, 2020 5:06 PM
>>> To: Moger, Babu <Babu.Moger@xxxxxxx>
>>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>; Thomas Gleixner
>>> <tglx@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>; Borislav Petkov
>>> <bp@xxxxxxxxx>; Yu, Fenghua <fenghua.yu@xxxxxxxxx>; Tony Luck
>>> <tony.luck@xxxxxxxxx>; Wanpeng Li <wanpengli@xxxxxxxxxxx>; kvm list
>>> <kvm@xxxxxxxxxxxxxxx>; Lendacky, Thomas <Thomas.Lendacky@xxxxxxx>;
>>> Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Sean Christopherson
>>> <seanjc@xxxxxxxxxx>; Joerg Roedel <joro@xxxxxxxxxx>; the arch/x86
>>> maintainers <x86@xxxxxxxxxx>; kyung.min.park@xxxxxxxxx; LKML <linux-
>>> kernel@xxxxxxxxxxxxxxx>; Krish Sadhukhan <krish.sadhukhan@xxxxxxxxxx>; H .
>>> Peter Anvin <hpa@xxxxxxxxx>; mgross@xxxxxxxxxxxxxxx; Vitaly Kuznetsov
>>> <vkuznets@xxxxxxxxxx>; Phillips, Kim <kim.phillips@xxxxxxx>; Huang2, Wei
>>> <Wei.Huang2@xxxxxxx>
>>> Subject: Re: [PATCH 2/2] KVM: SVM: Add support for Virtual SPEC_CTRL
>>>
>>> On Mon, Dec 7, 2020 at 2:38 PM Babu Moger <babu.moger@xxxxxxx> wrote:
>>>>
>>>> Newer AMD processors have a feature to virtualize the use of the
>>>> SPEC_CTRL MSR. When supported, the SPEC_CTRL MSR is automatically
>>>> virtualized and no longer requires hypervisor intervention.
>>>>
>>>> This feature is detected via CPUID function 0x8000000A_EDX[20]:
>>>> GuestSpecCtrl.
>>>>
>>>> Hypervisors are not required to enable this feature since it is
>>>> automatically enabled on processors that support it.
>>>>
>>>> When this feature is enabled, the hypervisor no longer has to
>>>> intercept the usage of the SPEC_CTRL MSR and is no longer required to
>>>> save and restore the guest SPEC_CTRL setting when switching
>>>> hypervisor/guest modes. The effective SPEC_CTRL setting is the guest
>>>> SPEC_CTRL setting or'ed with the hypervisor SPEC_CTRL setting. This
>>>> allows the hypervisor to ensure a minimum SPEC_CTRL if desired.
>>>>
>>>> This support also fixes an issue where a guest may sometimes see an
>>>> inconsistent value for the SPEC_CTRL MSR on processors that support
>>>> this feature. With the current SPEC_CTRL support, the first write to
>>>> SPEC_CTRL is intercepted and the virtualized version of the SPEC_CTRL
>>>> MSR is not updated. When the guest reads back the SPEC_CTRL MSR, it
>>>> will be 0x0 instead of the actual expected value. There isn't a
>>>> security concern here, because the host SPEC_CTRL value is or'ed with
>>>> the guest SPEC_CTRL value to generate the effective SPEC_CTRL value.
>>>> KVM writes the guest's virtualized SPEC_CTRL value to the SPEC_CTRL
>>>> MSR just before VMRUN, so the hardware will always have the actual
>>>> value even though it doesn't appear that way in the guest. The guest
>>>> will only see the proper value for the SPEC_CTRL register if it
>>>> writes to the SPEC_CTRL register again. With Virtual SPEC_CTRL
>>>> support, the MSR interception of SPEC_CTRL is disabled during
>>>> vmcb_init, so this will no longer be an issue.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>>>> ---
>>>
>>> Shouldn't there be some code to initialize a new "guest SPEC_CTRL"
>>> value in the VMCB, both at vCPU creation and at virtual processor reset?
>>
>> Yes, I think so. I will check on this.
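
To follow up on the initialization question: something along these lines in
init_vmcb() is roughly what I have in mind (untested sketch; "save.spec_ctrl"
is only my placeholder name for the new save-area field at offset 0x2E0 until
the APM update is published):

	/*
	 * Sketch: clear the guest view of SPEC_CTRL so a newly created or
	 * reset vCPU starts from 0, as on bare metal. init_vmcb() runs for
	 * both vCPU creation and virtual processor reset, so this should
	 * cover both cases.
	 */
	if (boot_cpu_has(X86_FEATURE_V_SPEC_CTRL))
		svm->vmcb->save.spec_ctrl = 0;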
>>
>>>
>>>>  arch/x86/kvm/svm/svm.c | 17 ++++++++++++++---
>>>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>> index 79b3a564f1c9..3d73ec0cdb87 100644
>>>> --- a/arch/x86/kvm/svm/svm.c
>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>> @@ -1230,6 +1230,14 @@ static void init_vmcb(struct vcpu_svm *svm)
>>>>
>>>>          svm_check_invpcid(svm);
>>>>
>>>> +        /*
>>>> +         * If the host supports V_SPEC_CTRL then disable the interception
>>>> +         * of MSR_IA32_SPEC_CTRL.
>>>> +         */
>>>> +        if (boot_cpu_has(X86_FEATURE_V_SPEC_CTRL))
>>>> +                set_msr_interception(&svm->vcpu, svm->msrpm, MSR_IA32_SPEC_CTRL,
>>>> +                                     1, 1);
>>>> +
>>>>          if (kvm_vcpu_apicv_active(&svm->vcpu))
>>>>                  avic_init_vmcb(svm);
>>>>
>>>> @@ -3590,7 +3598,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
>>>>           * is no need to worry about the conditional branch over the wrmsr
>>>>           * being speculatively taken.
>>>>           */
>>>> -        x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
>>>> +        if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
>>>> +                x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
>>>
>>> Is this correct for the nested case? Presumably, there is now a "guest
>>> SPEC_CTRL" value somewhere in the VMCB. If L1 does not intercept this MSR,
>>> then we need to transfer the "guest SPEC_CTRL" value from the
>>> vmcb01 to the vmcb02, don't we?
>>
>> Here is the text from the to-be-published documentation:
>> "When in host mode, the host SPEC_CTRL value is in effect and writes
>> update only the host version of SPEC_CTRL. On a VMRUN, the processor loads
>> the guest version of SPEC_CTRL from the VMCB. For non-SNP-enabled guests,
>> processor behavior is controlled by the logical OR of the two registers.
>> When the guest writes SPEC_CTRL, only the guest version is updated. On a
>> VMEXIT, the guest version is saved into the VMCB and the processor returns
>> to only using the host SPEC_CTRL for speculation control. The guest
>> SPEC_CTRL is located at offset 0x2E0 in the VMCB."
>> This offset is into the save area of the VMCB (i.e. 0x400 + 0x2E0).
>>
>> The feature X86_FEATURE_V_SPEC_CTRL will not be advertised to guests.
>> So, the guest will use the same mechanism as today, where it saves and
>> restores the value into/from svm->spec_ctrl. If the value saved in the VMSA
>> is left untouched, both an L1 and an L2 guest will get the proper value.
>> What matters is the initial setup of vmcb01 and vmcb02 when this
>> feature is available on the host (bare metal). I am going to investigate
>> that part. Do you still think I am missing something here?
>
> It doesn't matter whether X86_FEATURE_V_SPEC_CTRL is advertised to L1
> or not. If L1 doesn't virtualize MSR_SPEC_CTRL for L2, then L1 and L2
> share the same value for that MSR. With this change, the current value
> in vmcb01 is only in vmcb01, and doesn't get propagated anywhere else.
> Hence, if L1 changes the value of MSR_SPEC_CTRL, that change is not
> visible to L2.
>
> Thinking about what Sean said about live migration, I think the
> correct solution here is that the authoritative value for this MSR
> should continue to live in svm->spec_ctrl. When the CPU supports
> X86_FEATURE_V_SPEC_CTRL, we should just transfer the value into the
> VMCB prior to VMRUN and out of the VMCB after #VMEXIT.

Ok. Got it. I will try this approach. Thanks for the suggestion.
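
Concretely, the shape I will try in svm_vcpu_run() is roughly the following
(sketch only; svm->spec_ctrl stays the authoritative copy, and
"save.spec_ctrl" is again just my placeholder name for the new save-area
field):

	/*
	 * Sketch: use the hardware field only as a staging area around
	 * VMRUN/#VMEXIT, so that live migration and the nested case keep
	 * working off svm->spec_ctrl.
	 */
	if (static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
		svm->vmcb->save.spec_ctrl = svm->spec_ctrl;
	else
		x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);

	svm_vcpu_enter_exit(vcpu, svm);

	if (static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
		svm->spec_ctrl = svm->vmcb->save.spec_ctrl;
	else if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
		svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);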
>
>>
>>>
>>>>          svm_vcpu_enter_exit(vcpu, svm);
>>>>
>>>> @@ -3609,12 +3618,14 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
>>>>           * If the L02 MSR bitmap does not intercept the MSR, then we need to
>>>>           * save it.
>>>>           */
>>>> -        if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
>>>> +        if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL) &&
>>>> +            unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
>>>>                  svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
>>>
>>> Is this correct for the nested case? If L1 does not intercept this MSR, then it
>>> might have changed while L2 is running. Presumably, the hardware has stored
>>> the new value somewhere in the vmcb02 at #VMEXIT, but now we need to move
>>> that value into the vmcb01, don't we?
>>>
>>>>          reload_tss(vcpu);
>>>>
>>>> -        x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
>>>> +        if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
>>>> +                x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
>>>>
>>>>          vcpu->arch.cr2 = svm->vmcb->save.cr2;
>>>>          vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
>>>>
>>>
>>> It would be great if you could add some tests to kvm-unit-tests.
>>
>> Yes. I will check on this part.
>>
>> Thanks
>> Babu
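
For kvm-unit-tests, a minimal starting point might look like the test below
(sketch only, untested; it assumes the rdmsr()/wrmsr()/report()/
report_summary() helpers the x86 tests normally use, guards its own MSR
define in case the headers already provide it, and assumes the guest CPUID
exposes SPEC_CTRL so the accesses do not #GP):

#include "libcflat.h"
#include "processor.h"

#ifndef MSR_IA32_SPEC_CTRL
#define MSR_IA32_SPEC_CTRL	0x00000048
#endif
#ifndef SPEC_CTRL_IBRS
#define SPEC_CTRL_IBRS		(1ull << 0)
#endif

int main(void)
{
	u64 val;

	/*
	 * Write IBRS and read the MSR back: the guest should observe the
	 * value it just wrote, not a stale 0x0.
	 */
	wrmsr(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS);
	val = rdmsr(MSR_IA32_SPEC_CTRL);
	report(val == SPEC_CTRL_IBRS, "SPEC_CTRL reads back the written value");

	/* Clear it again and make sure the read follows the write. */
	wrmsr(MSR_IA32_SPEC_CTRL, 0);
	report(rdmsr(MSR_IA32_SPEC_CTRL) == 0, "SPEC_CTRL reads back zero after clearing");

	return report_summary();
}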