From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx> commit e4557c1a46b0d32746bd309e1941914b5a6912b4 upstream. If a user first sample a PEBS event on a fixed counter, then sample a non-PEBS event on the same fixed counter on Icelake, it will trigger spurious NMI. For example: perf record -e 'cycles:p' -a perf record -e 'cycles' -a The error message for spurious NMI: [June 21 15:38] Uhhuh. NMI received for unknown reason 30 on CPU 2. [ +0.000000] Do you have a strange power saving mode enabled? [ +0.000000] Dazed and confused, but trying to continue The bug was introduced by the following commit: commit 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()") The commit moves the intel_pmu_pebs_disable() after intel_pmu_disable_fixed(), which returns immediately. The related bit of PEBS_ENABLE MSR will never be cleared for the fixed counter. Then a non-PEBS event runs on the fixed counter, but the bit on PEBS_ENABLE is still set, which triggers spurious NMIs. Check and disable PEBS for fixed counters after intel_pmu_disable_fixed(). Reported-by: Yi, Ammy <ammy.yi@xxxxxxxxx> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx> Cc: <stable@xxxxxxxxxxxxxxx> Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx> Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> Cc: Jiri Olsa <jolsa@xxxxxxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> Cc: Stephane Eranian <eranian@xxxxxxxxxx> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Cc: Vince Weaver <vincent.weaver@xxxxxxxxx> Fixes: 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()") Link: https://lkml.kernel.org/r/20190625142135.22112-1-kan.liang@xxxxxxxxxxxxxxx Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- arch/x86/events/intel/core.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2092,12 +2092,10 @@ static void intel_pmu_disable_event(stru cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx); cpuc->intel_cp_status &= ~(1ull << hwc->idx); - if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) { + if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) intel_pmu_disable_fixed(hwc); - return; - } - - x86_pmu_disable_event(event); + else + x86_pmu_disable_event(event); /* * Needs to be called after x86_pmu_disable_event,