On Tue, Feb 11, 2025 at 02:24:36PM +0800, Yafang Shao wrote: > void klp_try_complete_transition(void) > { > + unsigned long timeout, proceed_pending_processes; > unsigned int cpu; > struct task_struct *g, *task; > struct klp_patch *patch; > @@ -467,9 +468,30 @@ void klp_try_complete_transition(void) > * unless the patch includes changes to a very common function. > */ > read_lock(&tasklist_lock); > - for_each_process_thread(g, task) > + timeout = jiffies + HZ; > + proceed_pending_processes = 0; > + for_each_process_thread(g, task) { > + /* check if this task has already switched over */ > + if (task->patch_state == klp_target_state) > + continue; > + > + proceed_pending_processes++; > + > if (!klp_try_switch_task(task)) > complete = false; > + > + /* > + * Prevent hardlockup by not blocking tasklist_lock for too long. > + * But guarantee the forward progress by making sure at least > + * some pending processes were checked. > + */ > + if (rwlock_is_contended(&tasklist_lock) && > + time_after(jiffies, timeout) && > + proceed_pending_processes > 100) { > + complete = false; > + break; > + } > + } > read_unlock(&tasklist_lock); Instead of all this can we not just use rcu_read_lock() instead of tasklist_lock? Petr, I know you mentioned that would widen the race window for the do_exit() path, but don't we need to fix that race anyway? -- Josh