On 19/06/20 11:40, Vitaly Kuznetsov wrote:
> Guest crashes are observed on a Cascade Lake system when 'perf top' is
> launched on the host, e.g.
>
>  BUG: unable to handle kernel paging request at fffffe0000073038
>  PGD 7ffa7067 P4D 7ffa7067 PUD 7ffa6067 PMD 7ffa5067 PTE ffffffffff120
>  Oops: 0000 [#1] SMP PTI
>  CPU: 1 PID: 1 Comm: systemd Not tainted 4.18.0+ #380
> ...
>  Call Trace:
>   serial8250_console_write+0xfe/0x1f0
>   call_console_drivers.constprop.0+0x9d/0x120
>   console_unlock+0x1ea/0x460
>
> Call traces are different but the crash is imminent. The problem was
> blindly bisected to the commit 041bc42ce2d0 ("KVM: VMX: Micro-optimize
> vmexit time when not exposing PMU"). It was also confirmed that the
> issue goes away if PMU is exposed to the guest.
>
> With some instrumentation of the guest we can see what is being switched
> (when we do atomic_switch_perf_msrs()):
>
>  vmx_vcpu_run: switching 2 msrs
>  vmx_vcpu_run: switching MSR38f guest: 70000000d host: 70000000f
>  vmx_vcpu_run: switching MSR3f1 guest: 0 host: 2
>
> The current guess is that PEBS (MSR_IA32_PEBS_ENABLE, 0x3f1) is to blame.
> Regardless of whether PMU is exposed to the guest or not, PEBS needs to
> be disabled upon switch.
>
> This reverts commit 041bc42ce2d0efac3b85bbb81dea8c74b81f4ef9.
>
> Reported-by: Maxime Coquelin <maxime.coquelin@xxxxxxxxxx>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> ---
> - Perf/KVM interactions are a mystery to me, thus RFC.
> ---
>  arch/x86/kvm/vmx/vmx.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 36c771728c8c..b1a23ad986ff 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6728,8 +6728,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>
>  	pt_guest_enter(vmx);
>
> -	if (vcpu_to_pmu(vcpu)->version)
> -		atomic_switch_perf_msrs(vmx);
> +	atomic_switch_perf_msrs(vmx);
>  	atomic_switch_umwait_control_msr(vmx);
>
>  	if (enable_preemption_timer)
>

Queued, thanks.

Paolo
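
For readers following the instrumentation output quoted above, here is a minimal userspace sketch of what the two logged MSR pairs imply. It is not the kernel's atomic_switch_perf_msrs() implementation. The struct, the `observed` table and both helper functions are hypothetical names made up for illustration, and MSR_CORE_PERF_GLOBAL_CTRL is assumed to be the name for 0x38f. The values are copied from the log. The sketch only computes which host bits have to be cleared while the guest runs, which in both MSRs turns out to be bit 1 (it appears to correspond to general-purpose counter 1 and its PEBS enable).

```c
#include <stddef.h>
#include <stdint.h>

/* MSR numbers as logged; the first name is an assumption, the second is from the commit message. */
#define MSR_CORE_PERF_GLOBAL_CTRL 0x38f
#define MSR_IA32_PEBS_ENABLE      0x3f1

/* Hypothetical model of one host/guest MSR pair; not a kernel structure. */
struct msr_switch_entry {
	uint32_t msr;
	uint64_t host;
	uint64_t guest;
};

/* Values copied from the "vmx_vcpu_run: switching ..." lines in the commit message. */
static const struct msr_switch_entry observed[] = {
	{ MSR_CORE_PERF_GLOBAL_CTRL, 0x70000000fULL, 0x70000000dULL },
	{ MSR_IA32_PEBS_ENABLE,      0x2ULL,         0x0ULL },
};

/* Bits that are set in the host value but must be clear while the guest runs. */
static uint64_t bits_cleared_for_guest(const struct msr_switch_entry *e)
{
	return e->host & ~e->guest;
}

/* Count the entries whose host and guest values differ, i.e. that need switching. */
static size_t count_switched(const struct msr_switch_entry *entries, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (entries[i].host != entries[i].guest)
			count++;
	return count;
}
```

Both entries differ between host and guest, matching the "switching 2 msrs" line. With the reverted optimization in place, a guest without an exposed PMU would skip the switch and keep running with the host's PEBS enable bit set.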