On Wednesday, September 5, 2018 10:28:19 AM CEST Qiuxu Zhuo wrote:
> Current NMI mechanism is to process all the handlers for each NMI.
> Because perf uses NMI, the GHES NMI handler runs unnecessarily for
> every perf NMI handling. This will be captured by PMU's PEBS (Precise
> Event Based Sampling) and disturb perf result.
>
> GHES NMIs are very rare because they are only used in extreme error
> situations or very frequent when machine is dying and error floods
> happen. So add a GHES NMI nice level via GHES platform device sysfs;
> if it's > 0 and any other NMI (e.g. PMU NMI) has been handled for
> current NMI, then skip current GHES NMI handler. So next PMU NMI can
> be processed early and perf result is not disturbed by GHES NMI handler.
>
> We rely statistically on the property that GHES NMIs are unlikely
> to collide with perf NMIs, or if they are frequent there will be enough
> of them that it doesn't matter. It's a heuristic that is not 100%
> correct, but a reasonable one, and it saves a lot of unnecessary
> work for every NMI.
>
> Test machines have HEST ACPI table installed and NMI notification
> set, test cmds are 'perf mem record -a sleep 1' and 'perf mem report'.
>
> Before applying patch (perf memory profile):
>
> On Intel Broadwell-4S:
> 0.63% 1 17910 LFB or LFB hit [k] intel_pstate_update_util
> 0.59% 1 16960 LFB or LFB hit [k] intel_pstate_update_util
> ...
> 0.30% 1 8722 L1 or L1 hit [k] ghes_notify_nmi
>
> On Intel Skylake-4S:
> 3.45% 1 20218 L1 hit [k] native_read_msr
> 1.21% 1 7078 LFB hit [k] intel_pstate_update_util
> ...
> 1.21% 1 7077 N/A miss [k] ghes_notify_nmi
>
> After applying patch and 'echo 1 > /sys/devices/platform/GHES.[0-9]*/nmi_nice':
> No GHES showed up in perf memory profile.
>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
> Suggested-by: Ying Huang <ying.huang@xxxxxxxxx>
> Reported-by: Andi Kleen <andi.kleen@xxxxxxxxx>

Unless this has been applied already, can you CC the entire series to
linux-acpi, please?

Thanks,
Rafael