On Mon, Aug 23, 2021, Alexander Shishkin wrote:
> Regardless of the "pt_mode", the kvm driver installs its interrupt handler
> for Intel PT, which always overrides the native handler, causing data loss
> inside kvm guests, while we're expecting to trace them.
>
> Fix this by only installing kvm's perf_guest_cbs if pt_mode is set to
> guest tracing.

Uh, regardless of the correctness of such a change (spoiler alert), making an
enormous leap from "one thing is wrong" to "nuke it all!" needs way more
justification/explanation.  Or more realistically, such a leap should be a good
indication that the proposed change is not correct.

> Signed-off-by: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
> Fixes: ff9d07a0e7ce7 ("KVM: Implement perf callbacks for guest sampling")

This should be another clue that the fix isn't correct.  That patch is from
2010, Intel PT was announced in 2013 and merged in 2019.

> Reported-by: Artem Kashkanov <artem.kashkanov@xxxxxxxxx>
> Tested-by: Artem Kashkanov <artem.kashkanov@xxxxxxxxx>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/vmx/vmx.c          |  6 ++++++
>  arch/x86/kvm/x86.c              | 10 ++++++++--
>  3 files changed, 15 insertions(+), 2 deletions(-)
>

...

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9b6bca616929..3ba0001e7388 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -268,6 +268,8 @@ static struct kmem_cache *x86_fpu_cache;
>
>  static struct kmem_cache *x86_emulator_cache;
>
> +static int __read_mostly intel_pt_enabled;
> +
>  /*
>   * When called, it means the previous get/set msr reached an invalid msr.
>   * Return true if we want to ignore/silent this failed msr access.
> @@ -8194,7 +8196,10 @@ int kvm_arch_init(void *opaque)
>
>  	kvm_timer_init();
>
> -	perf_register_guest_info_callbacks(&kvm_guest_cbs);
> +	if (ops->intel_pt_enabled && ops->intel_pt_enabled()) {

This is not remotely correct.  vmx.c's "pt_mode", which is queried via this
path, is modified by hardware_setup(), a.k.a. kvm_x86_ops.hardware_setup(),
which runs _after_ this code.  And as alluded to above, these are generic perf
callbacks; installing them if and only if Intel PT is enabled in a specific
mode completely breaks "regular" perf.

I'll post a small series; a bit of code massage is needed to fix this
properly.  The PMI handler can also be optimized to avoid a retpoline when PT
is not exposed to the guest.

> +		perf_register_guest_info_callbacks(&kvm_guest_cbs);
> +		intel_pt_enabled = 1;
> +	}
>
>  	if (boot_cpu_has(X86_FEATURE_XSAVE)) {
>  		host_xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
> @@ -8229,7 +8234,8 @@ void kvm_arch_exit(void)
>  	clear_hv_tscchange_cb();
>  #endif
>  	kvm_lapic_exit();
> -	perf_unregister_guest_info_callbacks(&kvm_guest_cbs);
> +	if (intel_pt_enabled)
> +		perf_unregister_guest_info_callbacks(&kvm_guest_cbs);
>
>  	if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
>  		cpufreq_unregister_notifier(&kvmclock_cpufreq_notifier_block,
> --
> 2.32.0
>
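
To illustrate the ordering problem with "pt_mode", here is a rough userspace sketch, not kernel code.  Every name in it is a hypothetical stand-in, and it assumes for illustration that the setup step downgrades the requested mode when the CPU cannot trace guests.  It only shows why a decision made before hardware_setup() can disagree with the final mode.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PT modes; not the kernel's definitions. */
enum pt_mode { PT_MODE_SYSTEM, PT_MODE_HOST_GUEST };

/* Assumption: the user requested guest tracing via the module parameter. */
static enum pt_mode pt_mode = PT_MODE_HOST_GUEST;

/* Assumption: this CPU cannot actually support guest tracing. */
static bool cpu_supports_guest_pt = false;

/* Stand-in for the query added by the proposed patch. */
static bool intel_pt_enabled(void)
{
	return pt_mode == PT_MODE_HOST_GUEST;
}

/* Stand-in for hardware_setup(): may rewrite pt_mode after probing the CPU. */
static void hardware_setup(void)
{
	if (!cpu_supports_guest_pt)
		pt_mode = PT_MODE_SYSTEM;
}

/* Records the registration decision taken at init time. */
static bool callbacks_registered;

/* Stand-in for kvm_arch_init() as patched: queries the mode too early. */
static void arch_init_as_patched(void)
{
	callbacks_registered = intel_pt_enabled();
}

/* Returns true if the init-time decision no longer matches the final mode. */
static bool init_decision_is_stale(void)
{
	arch_init_as_patched();	/* runs first, sees the requested mode */
	hardware_setup();	/* runs later, may change the mode */
	return callbacks_registered != intel_pt_enabled();
}
```

Under these assumptions init_decision_is_stale() returns true: the callbacks were registered based on the requested mode, while the mode actually in effect after setup is different.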