On Fri, Feb 14, 2025 at 03:44:10PM +0100, Petr Mladek wrote: > I guess that we really could consider the new task as migrated > and clear TIF_PATCH_PENDING. > > But we can't set child->patch_state to KLP_TRANSITION_IDLE. It won't > work when the transition gets reverted. [*] Hm, why not? I don't see any difference between patching or unpatching? klp_init_transition() has a barrier to enforce the order of the klp_target_state and func->transition writes, as read by the klp_ftrace_handler(). So in the ftrace handler, if func->transition is set and the task is KLP_TRANSITION_IDLE, it can use klp_target_state to decide which function to use. Its patch state would effectively be the same as any other already-transitioned task, whether it's patching or unpatching. Then in klp_complete_transition(), after func->transition gets set to false, klp_synchronize_transition() flushes out any running ftrace handlers. From that point on, func->transition is false, so the ftrace handler would no longer read klp_target_state. > [*] I gave this few brain cycles but I did not find any elegant > way how to set this a safe way and allow using rcu_read_lock() > in klp_try_complete_transition(). > > It might be because it is Friday evening and I am leaving for > a trip tomorrow. Also I not motivated enough to think about it > because Yafang saw the RCU stall even with that rcu_read_lock(). > So I send this just for record. Even if it doesn't fix the RCU stalls, I think we should still try to avoid holding the tasklist_lock. It's a global lock which can be contended, and we want the livepatch transition to be as unobtrusive as possible. If the system is doing a lot of forking across many CPUs, holding the lock could block all the forking tasks and trigger system-wide scheduling latencies. And that could be compounded by the unnecessary transitioning of new tasks every time the delayed work runs. -- Josh