Re: [PATCH 2/3] livepatch: Avoid blocking tasklist_lock too long

Petr Mladek <pmladek@xxxxxxxx> · Thu, 13 Feb 2025 10:48:27 +0100

On Wed 2025-02-12 17:36:03, Josh Poimboeuf wrote:
> On Wed, Feb 12, 2025 at 04:42:39PM +0100, Petr Mladek wrote:
> > CPU0				CPU1
> > 
> > 				klp_try_complete_transition()
> > 
> > 
> > taskA:	
> >  + fork()
> >    + klp_copy_process()
> >       child->patch_state = KLP_PATCH_UNPATCHED
> > 
> > 				  klp_try_switch_task(taskA)
> > 				    // safe
> > 
> > 				taskA->patch_state = KLP_PATCH_PATCHED
> > 
> > 				all processes patched
> > 
> > 				klp_finish_transition()
> > 
> > 
> > 	list_add_tail_rcu(&p->thread_node,
> > 			  &p->signal->thread_head);
> > 
> > 
> > BANG: The forked task has KLP_PATCH_UNPATCHED so that
> >       klp_ftrace_handler() will redirect it to the old code.
> > 
> >       But CPU1 thinks that all tasks are migrated and is going
> >       to finish the transition
> 
> 
> Maybe klp_try_complete_transition() could iterate the tasks in two
> passes?  The first pass would use rcu_read_lock().  Then if all tasks
> appear to be patched, try again with tasklist_lock.
> 
> Or, we could do something completely different.  There's no need for
> klp_copy_process() to copy the parent's state: a newly forked task can
> be patched immediately because it has no stack.

Is this true, please?

If I understand it correctly, copy_process() is also used by fork(2),
where the child continues from the fork(2) call. I can't find it in the
code, but I suppose that the child uses a copy of the parent's stack
in this case.

Or am I wrong?

Best Regards,
Petr