Should the trylock succeed (and thus blocking was avoided), the routine
wants to ensure blocking was still legal to do. However, the method
used ends up calling __cond_resched, injecting a voluntary preemption
point with the freshly acquired lock held.

One can hack around it using __might_sleep instead of mere might_sleep,
but since threads keep going off CPU here, I figured it is better to
accommodate it.

Drop the trylock and take the read lock instead, which performs the
check prior to acquiring the lock.

Found by checking off-CPU time during a kernel build (like so:
"offcputime-bpfcc -Ku"), sample backtrace:

    finish_task_switch.isra.0
    __schedule
    __cond_resched
    lock_mm_and_find_vma
    do_user_addr_fault
    exc_page_fault
    asm_exc_page_fault
    -                sh (4502)
        10

Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>
---
 mm/memory.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 1ec1ef3418bf..f31d5243272b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5257,12 +5257,6 @@ EXPORT_SYMBOL_GPL(handle_mm_fault);
 
 static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
 {
-	/* Even if this succeeds, make it clear we *might* have slept */
-	if (likely(mmap_read_trylock(mm))) {
-		might_sleep();
-		return true;
-	}
-
 	if (regs && !user_mode(regs)) {
 		unsigned long ip = instruction_pointer(regs);
 		if (!search_exception_tables(ip))
-- 
2.39.2
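
For review context, not part of the patch itself: below is a sketch of
how get_mmap_lock_carefully() reads with the fast path removed, assuming
the tail of the function is unchanged from mainline (the final return
statement sits outside the hunk context above, so treat it as an
assumption). mmap_read_lock_killable() boils down to
down_read_killable(), which runs might_sleep() before the rwsem is
taken, so the blocking-legality assertion is preserved while the
post-acquire __cond_resched() preemption point goes away.

	static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
	{
		/* Kernel-mode faults without a fixup entry must not block. */
		if (regs && !user_mode(regs)) {
			unsigned long ip = instruction_pointer(regs);
			if (!search_exception_tables(ip))
				return false;
		}

		/*
		 * down_read_killable() calls might_sleep() before it
		 * acquires the rwsem, so the check now happens without
		 * the lock held.
		 */
		return !mmap_read_lock_killable(mm);
	}

The trade-off is that the dedicated trylock fast path is gone; the
off-CPU time observed above is the argument that this is a net win.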