It was found that when the perf-record command is run on a large system
with many cores and GHES enabled, the workload being monitored slows
down significantly. For example,

  # ./hackbench
  Time: 0.047
  # perf record ./hackbench
  Time: 21.392

The big slowdown was traced to the fact that too much time (more than
1ms) was spent in each invocation of ghes_notify_nmi(). The perf-record
command causes NMIs to be generated periodically at high frequency. If
many cores are doing that, a long queue forms on the global
ghes_nmi_lock.

This patch allows at most 2 CPUs to be working in ghes_notify_nmi() at
any given instant. This significantly reduces the time spent by each CPU
in ghes_notify_nmi(). With the patch applied, the perf slowdown is
significantly reduced:

  # perf record ./hackbench
  Time: 0.133

Signed-off-by: Waiman Long <Waiman.Long@xxxxxx>
---
 drivers/acpi/apei/ghes.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index e82d097..132a014 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -803,6 +803,20 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	int sev, sev_global = -1;
 	int ret = NMI_DONE;
 
+	/* # of CPUs processing NMI */
+	static atomic_t ghes_nmi_cnt = ATOMIC_INIT(0);
+
+	/*
+	 * Multiple CPUs may get NMI and come here to check GHES. If one has
+	 * gotten the lock and is doing the check, there is no point in having
+	 * more than one other CPU waiting just in case the active CPU has
+	 * passed the point where the NMI was generated. So we set a limit of
+	 * only allowing 2 CPUs working here. The rest can just return.
+	 */
+	if (atomic_add_return(1, &ghes_nmi_cnt) > 2) {
+		atomic_dec(&ghes_nmi_cnt);
+		return ret;
+	}
 	raw_spin_lock(&ghes_nmi_lock);
 	list_for_each_entry_rcu(ghes, &ghes_nmi, list) {
 		if (ghes_read_estatus(ghes, 1)) {
@@ -863,6 +877,7 @@ next:
 #endif
 
 out:
+	atomic_dec(&ghes_nmi_cnt);
 	raw_spin_unlock(&ghes_nmi_lock);
 	return ret;
 }
-- 
1.7.1
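
For anyone who wants to experiment with the counter logic outside the
kernel, the admission check can be sketched in userspace roughly as
follows. This is only an illustration under assumptions: C11 atomics
stand in for the kernel's atomic_t, and the names MAX_CALLERS,
nr_callers, limiter_try_enter(), limiter_leave() and limiter_count()
are made up for this sketch and do not appear in the patch.

```c
/* Userspace sketch only; C11 atomics stand in for the kernel's atomic_t. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_CALLERS 2	/* same limit the patch uses */

/* Hypothetical stand-in for ghes_nmi_cnt; static storage starts at 0. */
static atomic_int nr_callers;

/*
 * Try to join the callers that hold or wait for the lock.
 * Returns false when the limit is already reached; the caller should
 * then return right away without touching the lock.
 */
static bool limiter_try_enter(void)
{
	/* atomic_fetch_add() returns the old value, so add 1 for the new one. */
	if (atomic_fetch_add(&nr_callers, 1) + 1 > MAX_CALLERS) {
		atomic_fetch_sub(&nr_callers, 1);	/* undo our increment */
		return false;
	}
	return true;
}

/* Give the slot back; call once per successful limiter_try_enter(). */
static void limiter_leave(void)
{
	int old = atomic_fetch_sub(&nr_callers, 1);

	assert(old > 0);	/* catches an unbalanced leave */
}

/* Current counter value, for inspection only. */
static int limiter_count(void)
{
	return atomic_load(&nr_callers);
}
```

A handler built on this would call limiter_try_enter() first and return
immediately if it fails. Otherwise it takes the lock, does its work,
calls limiter_leave() and then drops the lock, which is the order used
at the out: label in the patch. The counter can briefly read higher
than 2 while a rejected caller undoes its increment, but no more than 2
callers are ever admitted at the same time.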