> From: Sean Christopherson <seanjc@xxxxxxxxxx>
> Sent: Wednesday, December 29, 2021 9:05 AM
>
> On Wed, Dec 22, 2021, Jing Liu wrote:
> > @@ -1968,6 +1969,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >  	case MSR_IA32_XFD:
> >  		ret = kvm_set_msr_common(vcpu, msr_info);
> >  		if (!ret && data) {
> > +			vmx_disable_intercept_for_msr(vcpu, MSR_IA32_XFD, MSR_TYPE_RW);
> > +			vcpu->arch.xfd_out_of_sync = true;
>
> xfd_out_of_sync is a poor name, as XFD _may_ be out of sync, or it may
> not.  It's also confusing that it's kept set after XFD is explicitly
> synchronized in vcpu_enter_guest().

Yes, sync_xfd_after_exit might be more accurate.

> > +
> >  			vcpu->arch.trap_nm = true;
> >  			vmx_update_exception_bitmap(vcpu);
>
> Ah, this is why #NM interception was made sticky many patches ago.  More
> at the end.
>
> >  		}
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index bf9d3051cd6c..0a00242a91e7 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -340,7 +340,7 @@ struct vcpu_vmx {
> >  	struct lbr_desc lbr_desc;
> >
> >  	/* Save desired MSR intercept (read: pass-through) state */
> > -#define MAX_POSSIBLE_PASSTHROUGH_MSRS	14
> > +#define MAX_POSSIBLE_PASSTHROUGH_MSRS	15
> >  	struct {
> >  		DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS);
> >  		DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS);
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 3b756ff13103..10a08aa2aa45 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -10024,6 +10024,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >  	if (vcpu->arch.guest_fpu.xfd_err)
> >  		wrmsrl(MSR_IA32_XFD_ERR, 0);
> >
> > +	if (vcpu->arch.xfd_out_of_sync)
>
> Rather than adding a flag that tracks whether or not the MSR can be
> written by the guest, can't this be:
>
> 	if (!vmx_test_msr_bitmap_write(vcpu->loaded_vmcs->msr_bitmap))
> 		fpu_sync_guest_vmexit_xfd_state();

We can use this.
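
To double check my understanding, the exit path would then look roughly
like below (just a sketch; since vcpu_enter_guest() is common x86 code
while vmx_test_msr_bitmap_write() is VMX-specific, I assume the check
has to sit behind a vendor hook or similar, and the actual helper also
takes the MSR index, which your snippet omits):

	/*
	 * Sync XFD back to the software state iff the guest may have
	 * written MSR_IA32_XFD directly, i.e. iff write interception
	 * was disabled after the first non-zero write.
	 */
	if (!vmx_test_msr_bitmap_write(vcpu->loaded_vmcs->msr_bitmap,
				       MSR_IA32_XFD))
		fpu_sync_guest_vmexit_xfd_state();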
> That might be marginally slower than checking a dedicated flag?  But it
> has the advantage of doing the correct thing for nested guests instead
> of penalizing them with an unnecessary sync on every exit.  If
> performance of the check is an issue, we could add a static key to skip
> the code unless at least one vCPU has triggered the XFD crud, a la
> kvm_has_noapic_vcpu (which may or may not provide any real performance
> benefits).
>
> Speaking of nested, interception of #NM in vmx_update_exception_bitmap()
> is wrong with respect to nested guests.  Until XFD is supported for L2,
> which I didn't see in this series, #NM should not be intercepted while
> L2 is running.

Can you remind me what else is required to support XFD for L2? If it is
only about performance, I'd prefer to keep the current conservative
approach as the first step. As explained earlier, #NM should be rare if
the guest doesn't run AMX applications at all, so adding nested into
this picture doesn't make things much worse.

> For the earlier patch that introduced arch.trap_nm, if it's not too
> gross and not racy, the code could be:
>
> 	if (is_guest_mode(vcpu))
> 		eb |= get_vmcs12(vcpu)->exception_bitmap;
> 	else {
> 		...
>
> 		if (vcpu->arch.guest_fpu.fpstate.xfd)
> 			eb |= (1u << NM_VECTOR);
> 	}
>
> Though I'm ok with a semi-temporary flag if that's gross/racy.
>
> Then this patch can change it to:
>
> 	if (is_guest_mode(vcpu))
> 		eb |= get_vmcs12(vcpu)->exception_bitmap;
> 	else {
> 		...
>
> 		if (!vmx_test_msr_bitmap_write(vcpu->vmcs01.msr_bitmap))
> 			eb |= (1u << NM_VECTOR);
> 	}
>
> > +		fpu_sync_guest_vmexit_xfd_state();
> > +
> >  	/*
> >  	 * Consume any pending interrupts, including the possible source of
> >  	 * VM-Exit on SVM and any ticks that occur between VM-Exit and now.
> > --
> > 2.27.0
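
btw, for the static key idea, I assume you mean something like below, a
la how kvm_has_noapic_vcpu is maintained (the key name here is made up
for illustration; it would be bumped the first time XFD write
interception gets disabled for any vCPU):

	/* in x86.c; hypothetical name, not in this series */
	DEFINE_STATIC_KEY_FALSE(kvm_xfd_maybe_out_of_sync);

	/* where vmx_set_msr() disables MSR_IA32_XFD write interception */
	static_branch_inc(&kvm_xfd_maybe_out_of_sync);

	/* in vcpu_enter_guest(), so unaffected VMs skip the check */
	if (static_branch_unlikely(&kvm_xfd_maybe_out_of_sync) &&
	    !vmx_test_msr_bitmap_write(vcpu->loaded_vmcs->msr_bitmap,
				       MSR_IA32_XFD))
		fpu_sync_guest_vmexit_xfd_state();

Thanks
Kevin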