On Fri, Feb 14, 2025 at 10:44:59AM +0800, Yafang Shao wrote:
> The longest duration of klp_try_complete_transition() ranges from 8.5
> to 17.2 seconds.
>
> It appears that the RCU stall is not only driven by num_processes *
> average_klp_try_switch_task, but also by contention within
> klp_try_complete_transition(), particularly around the tasklist_lock.
> Interestingly, even after replacing "read_lock(&tasklist_lock)" with
> "rcu_read_lock()", the RCU stall persists. My verification shows that
> the only way to prevent the stall is by checking need_resched() during
> each iteration of the loop.

I'm confused... rcu_read_lock() shouldn't cause any contention, right?
So if klp_try_switch_task() isn't the problem, then what is?

I wonder if those function timings might be misleading. If
klp_try_complete_transition() gets preempted immediately when it
releases the lock, it could take a while before it eventually returns.
So that funclatency might not be telling the whole story. Though 8.5 -
17.2 seconds is a bit excessive...

-- 
Josh