Should the trylock succeed (and thus blocking was avoided), the routine
wants to ensure blocking was still legal to do. However, the method
used ends up calling __cond_resched, injecting a voluntary preemption
point with the freshly acquired lock held.

One can hack around it using __might_sleep instead of mere might_sleep,
but since threads keep going off CPU here, I figured it is better to
accommodate it.

Drop the trylock and take the read lock instead, which performs the
check prior to acquiring the lock.

Found by checking off-CPU time during a kernel build (like so:
"offcputime-bpfcc -Ku"), sample backtrace:

    finish_task_switch.isra.0
    __schedule
    __cond_resched
    lock_mm_and_find_vma
    do_user_addr_fault
    exc_page_fault
    asm_exc_page_fault
    -                sh (4502)
        10

Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>
---
 mm/memory.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 1ec1ef3418bf..f31d5243272b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5257,12 +5257,6 @@ EXPORT_SYMBOL_GPL(handle_mm_fault);
 
 static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
 {
-	/* Even if this succeeds, make it clear we *might* have slept */
-	if (likely(mmap_read_trylock(mm))) {
-		might_sleep();
-		return true;
-	}
-
 	if (regs && !user_mode(regs)) {
 		unsigned long ip = instruction_pointer(regs);
 		if (!search_exception_tables(ip))
-- 
2.39.2
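
For review context, not part of the patch itself: below is a sketch of
how get_mmap_lock_carefully() reads with the fast path removed, assuming
the tail of the function is unchanged from mainline (the final return
statement sits outside the hunk context above, so treat it as an
assumption). mmap_read_lock_killable() boils down to
down_read_killable(), which runs might_sleep() before the rwsem is
taken, so the blocking-legality assertion is preserved while the
post-acquire __cond_resched() preemption point goes away.

	static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
	{
		/* Kernel-mode faults without a fixup entry must not block. */
		if (regs && !user_mode(regs)) {
			unsigned long ip = instruction_pointer(regs);
			if (!search_exception_tables(ip))
				return false;
		}

		/*
		 * down_read_killable() calls might_sleep() before it
		 * acquires the rwsem, so the check now happens without
		 * the lock held.
		 */
		return !mmap_read_lock_killable(mm);
	}

The trade-off is that the dedicated trylock fast path is gone; the
off-CPU time observed above is the argument that this is a net win.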