On Thu, Aug 25, 2022, Xiaoyao Li wrote:
> On 8/25/2022 11:34 PM, Sean Christopherson wrote:
> > On Thu, Aug 25, 2022, Xiaoyao Li wrote:
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index d7f8331d6f7e..3e9ce8f600d2 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -38,6 +38,7 @@
> > >   #include <asm/fpu/api.h>
> > >   #include <asm/fpu/xstate.h>
> > >   #include <asm/idtentry.h>
> > > +#include <asm/intel_pt.h>
> > >   #include <asm/io.h>
> > >   #include <asm/irq_remapping.h>
> > >   #include <asm/kexec.h>
> > > @@ -1128,13 +1129,19 @@ static void pt_guest_enter(struct vcpu_vmx *vmx)
> > >   	if (vmx_pt_mode_is_system())
> > >   		return;
> > > +	/*
> > > +	 * Stop Intel PT on host to avoid vm-entry failure since
> > > +	 * VM_ENTRY_LOAD_IA32_RTIT_CTL is set
> > > +	 */
> > > +	intel_pt_stop();
> > > +
> > >   	/*
> > >   	 * GUEST_IA32_RTIT_CTL is already set in the VMCS.
> > >   	 * Save host state before VM entry.
> > >   	 */
> > >   	rdmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
> >
> > KVM's manual save/restore of MSR_IA32_RTIT_CTL should be dropped.
>
> No. It cannot. Please see below.
>
> > If PT/RTIT can
> > trace post-VMXON, then intel_pt_stop() will disable tracing and intel_pt_resume()
> > will restore the host's desired value.
>
> intel_pt_stop() and intel_pt_resume() touches host's RTIT_CTL only when host
> enables/uses Intel PT. Otherwise, they're just noop. In this case, we cannot
> assume host's RTIT_CTL is zero (only the RTIT_CTL.TraceEn is 0). After
> VM-exit, RTIT_CTL is cleared, we need to restore it.

But ensuring RTIT_CTL.TraceEn=0 is all that's needed to make VM-Entry happy,
and if the host isn't using Intel PT, what do we care if other bits that, for all
intents and purposes, are ignored, are lost across VM-Entry/VM-Exit?  I gotta
imagine perf will fully initialize RTIT_CTL if it starts using PT.

Actually, if the host isn't actively using Intel PT, can KVM avoid saving the other
RTIT MSRs?

Even better, can we hand that off to perf?  I really dislike KVM making assumptions
about perf's internal behavior.  E.g. can this be made to look like

	intel_pt_guest_enter(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN);

and

	intel_pt_guest_exit(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN);

> > >   	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
> > > -		wrmsrl(MSR_IA32_RTIT_CTL, 0);
> > > +		/* intel_pt_stop() ensures RTIT_CTL.TraceEn is zero */
> > >   		pt_save_msr(&vmx->pt_desc.host, vmx->pt_desc.num_address_ranges);
> >
> > Isn't this at risk of the same corruption?  What prevents a PT NMI that arrives
> > after this point from changing other RTIT MSRs, thus causing KVM to restore the
> > wrong values?
>
> intel_pt_stop() -> pt_event_stop() will do
>
> 	WRITE_ONCE(pt->handle_nmi, 0);
>
> which ensure PT NMI handler as noop that at the beginning of
> intel_pt_interrupt():
>
> 	if (!READ_ONCE(pt->handle_nmi))
> 		return;

Ah, right.

> > >   		pt_load_msr(&vmx->pt_desc.guest, vmx->pt_desc.num_address_ranges);
> > >   	}
> > > @@ -1156,6 +1163,8 @@ static void pt_guest_exit(struct vcpu_vmx *vmx)
> > >   	 */
> > >   	if (vmx->pt_desc.host.ctl)
> > >   		wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
> > > +
> > > +	intel_pt_resume();
> > >   }
> > >   void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel,
> > > --
> > > 2.27.0
> > >
>
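
For illustration only, the stop/resume behavior and the handle_nmi gate discussed
above can be modeled in userspace roughly as below.  This is a sketch under
assumptions, not the perf implementation: struct pt_model, its fields, and the
model_pt_*() names are all hypothetical, the "MSR" is a plain variable, and the
only behavior modeled is what the thread states (stop/resume are no-ops when the
host isn't using PT, and the NMI handler bails early once handle_nmi is cleared).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTIT_CTL_TRACEEN (1ULL << 0)	/* TraceEn is bit 0 of IA32_RTIT_CTL */

/* Hypothetical userspace stand-in for perf's per-CPU PT state. */
struct pt_model {
	uint64_t rtit_ctl;		/* simulated MSR_IA32_RTIT_CTL */
	bool host_tracing;		/* host has an active PT event on this CPU */
	bool handle_nmi;		/* gate checked first by the NMI handler */
	bool stopped_for_guest;		/* set by stop, consumed by resume */
	unsigned int nmis_handled;	/* NMIs that got past the gate */
};

/* Models intel_pt_stop(): a no-op unless the host is using PT. */
static void model_pt_stop(struct pt_model *pt)
{
	if (!pt->host_tracing)
		return;

	pt->handle_nmi = false;			/* NMI handler becomes a no-op */
	pt->rtit_ctl &= ~RTIT_CTL_TRACEEN;	/* VM-Entry needs TraceEn=0 */
	pt->stopped_for_guest = true;
}

/* Models intel_pt_resume(): undoes model_pt_stop(), otherwise a no-op. */
static void model_pt_resume(struct pt_model *pt)
{
	if (!pt->stopped_for_guest)
		return;

	pt->stopped_for_guest = false;
	pt->rtit_ctl |= RTIT_CTL_TRACEEN;
	pt->handle_nmi = true;
}

/* Models the early-out at the top of the PT NMI handler. */
static void model_pt_interrupt(struct pt_model *pt)
{
	if (!pt->handle_nmi)
		return;

	/* In this model the gate is only ever open while the host traces. */
	assert(pt->host_tracing);
	pt->nmis_handled++;
}
```

In this model, an NMI that lands between stop and resume never gets past the
gate, and a CPU where the host isn't tracing keeps whatever value was in
rtit_ctl across stop/resume, which is the case the RTIT_CTL restore question
above is about.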