From: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>

In a micro-benchmark that stresses the page fault path, the
down_read_trylock on the mmap_sem showed up quite high in the profile.
It turns out this lock bounces between CPUs quite a bit and is therefore
often cache-cold. This patch prefetches the lock (for write) as early as
possible, before some other somewhat expensive operations. With this
patch, the down_read_trylock basically fell out of the top of the
profile.

Signed-off-by: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
Signed-off-by: Andi Kleen <ak@xxxxxxx>

---
 arch/x86_64/mm/fault.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

Index: linux/arch/x86_64/mm/fault.c
===================================================================
--- linux.orig/arch/x86_64/mm/fault.c
+++ linux/arch/x86_64/mm/fault.c
@@ -314,11 +314,13 @@ asmlinkage void __kprobes do_page_fault(
 	unsigned long flags;
 	siginfo_t info;
 
+	tsk = current;
+	mm = tsk->mm;
+	prefetchw(&mm->mmap_sem);
+
 	/* get the address */
 	__asm__("movq %%cr2,%0":"=r" (address));
 
-	tsk = current;
-	mm = tsk->mm;
 	info.si_code = SEGV_MAPERR;
 
 
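
For readers outside the kernel tree, the idea in the patch can be sketched in user space. This is only an illustration under stated assumptions, not the kernel code: `struct mm_sketch`, `read_trylock`, `read_unlock` and `lookup_with_prefetch` are hypothetical names, the lock is a toy reader counter built on C11 atomics rather than a real rw_semaphore, and the GCC/Clang builtin `__builtin_prefetch(addr, 1, 3)` stands in for `prefetchw()`. It shows why the prefetch is "for write": even a read-side trylock has to modify the lock word, so the cache line is wanted in a writable state.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm_struct; not the real kernel layout. */
struct mm_sketch {
    atomic_int lock;          /* >= 0: number of readers, -1: writer holds it */
    unsigned long start, end; /* a single mapped range [start, end) */
};

/* Toy read-side trylock: note that it writes to the lock word on success. */
static bool read_trylock(atomic_int *lock)
{
    int readers = atomic_load(lock);

    while (readers >= 0) {
        /* On failure, 'readers' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(lock, &readers, readers + 1))
            return true;
    }
    return false; /* a writer holds the lock */
}

static void read_unlock(atomic_int *lock)
{
    atomic_fetch_sub(lock, 1);
}

/* Returns 1 if the address is mapped, 0 if not, -1 if the lock was busy. */
static int lookup_with_prefetch(struct mm_sketch *mm, unsigned long address)
{
    /* Ask for the lock's cache line (for write) as early as possible. */
    __builtin_prefetch(&mm->lock, 1, 3);

    /* Stand-in for the unrelated work done before the lock is needed. */
    unsigned long page = address & ~0xfffUL;

    if (!read_trylock(&mm->lock))
        return -1;

    int mapped = (page >= mm->start && page < mm->end);

    read_unlock(&mm->lock);
    return mapped;
}
```

The prefetch is only a hint and changes no results; whether it helps depends on how much independent work sits between the hint and the lock acquisition, which is why the patch moves the `tsk`/`mm` setup ahead of the CR2 read.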