Commit-ID:  e9d8c61557687b7126101e9550bdf243223f0d8f
Gitweb:     https://git.kernel.org/tip/e9d8c61557687b7126101e9550bdf243223f0d8f
Author:     Rik van Riel <riel@xxxxxxxxxxx>
AuthorDate: Mon, 16 Jul 2018 15:03:37 -0400
Committer:  Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Tue, 17 Jul 2018 09:35:34 +0200

x86/mm/tlb: Skip atomic operations for 'init_mm' in switch_mm_irqs_off()

Song Liu noticed switch_mm_irqs_off() taking a lot of CPU time in recent
kernels, using 1.8% of a 48 CPU system during a netperf to localhost run.
Digging into the profile, we noticed that cpumask_clear_cpu and
cpumask_set_cpu together take about half of the CPU time taken by
switch_mm_irqs_off().

However, the CPUs running netperf end up switching back and forth
between netperf and the idle task, which does not require changes
to the mm_cpumask. Furthermore, the init_mm cpumask ends up being
the most heavily contended one in the system.

Simply skipping changes to mm_cpumask(&init_mm) reduces overhead.

Reported-and-tested-by: Song Liu <songliubraving@xxxxxx>
Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
Acked-by: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: efault@xxxxxx
Cc: kernel-team@xxxxxx
Cc: luto@xxxxxxxxxx
Link: http://lkml.kernel.org/r/20180716190337.26133-8-riel@xxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
 arch/x86/mm/tlb.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 493559cae2d5..f086195f644c 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -310,15 +310,22 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			sync_current_stack_to_mm(next);
 		}
 
-		/* Stop remote flushes for the previous mm */
-		VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu, mm_cpumask(real_prev)) &&
-				real_prev != &init_mm);
-		cpumask_clear_cpu(cpu, mm_cpumask(real_prev));
+		/*
+		 * Stop remote flushes for the previous mm.
+		 * Skip kernel threads; we never send init_mm TLB flushing IPIs,
+		 * but the bitmap manipulation can cause cache line contention.
+		 */
+		if (real_prev != &init_mm) {
+			VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu,
+						mm_cpumask(real_prev)));
+			cpumask_clear_cpu(cpu, mm_cpumask(real_prev));
+		}
 
 		/*
 		 * Start remote flushes and then read tlb_gen.
 		 */
-		cpumask_set_cpu(cpu, mm_cpumask(next));
+		if (next != &init_mm)
+			cpumask_set_cpu(cpu, mm_cpumask(next));
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
 
 		choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
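
The effect of the two new checks can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct model_mm`, `model_init_mm`, `model_atomic_ops`, `model_switch_mm()` and the other `model_*` helpers are hypothetical names, C11 atomics stand in for the kernel's cpumask helpers, and the operation counter is not thread-safe because the model is meant to be driven from a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct mm_struct. */
struct model_mm {
	_Atomic uint64_t cpumask;	/* one bit per CPU, shared by all CPUs */
};

/* Stand-in for the kernel's init_mm, used by idle and kernel threads. */
struct model_mm model_init_mm;

/* Counts atomic read-modify-write operations (single-threaded demo only). */
unsigned long model_atomic_ops;

/* Atomically clear this CPU's bit in the mm's mask. */
void model_clear_cpu(unsigned int cpu, struct model_mm *mm)
{
	atomic_fetch_and(&mm->cpumask, ~(UINT64_C(1) << cpu));
	model_atomic_ops++;
}

/* Atomically set this CPU's bit in the mm's mask. */
void model_set_cpu(unsigned int cpu, struct model_mm *mm)
{
	atomic_fetch_or(&mm->cpumask, UINT64_C(1) << cpu);
	model_atomic_ops++;
}

/* Report whether this CPU's bit is set in the mm's mask. */
bool model_test_cpu(unsigned int cpu, struct model_mm *mm)
{
	return (atomic_load(&mm->cpumask) >> cpu) & 1;
}

/*
 * Mirrors the shape of the patched logic: the mask of the init_mm
 * stand-in is never modified, so switches to and from the idle task
 * touch only the user mm's mask.
 */
void model_switch_mm(unsigned int cpu, struct model_mm *prev,
		     struct model_mm *next)
{
	assert(cpu < 64);

	if (prev != &model_init_mm)
		model_clear_cpu(cpu, prev);

	if (next != &model_init_mm)
		model_set_cpu(cpu, next);
}
```

In this model, a round trip from the idle task to a user task and back (`model_switch_mm(cpu, &model_init_mm, &task)` followed by `model_switch_mm(cpu, &task, &model_init_mm)`) issues two atomic operations, both on the task's own mask, where unconditional updates would issue four, two of them on the mask shared by every idle CPU.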