Re: [PATCH 2/3] livepatch: Avoid blocking tasklist_lock too long

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 12, 2025 at 8:40 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
>
> On Tue, Feb 11, 2025 at 02:24:36PM +0800, Yafang Shao wrote:
> >  void klp_try_complete_transition(void)
> >  {
> > +     unsigned long timeout, proceed_pending_processes;
> >       unsigned int cpu;
> >       struct task_struct *g, *task;
> >       struct klp_patch *patch;
> > @@ -467,9 +468,30 @@ void klp_try_complete_transition(void)
> >        * unless the patch includes changes to a very common function.
> >        */
> >       read_lock(&tasklist_lock);
> > -     for_each_process_thread(g, task)
> > +     timeout = jiffies + HZ;
> > +     proceed_pending_processes = 0;
> > +     for_each_process_thread(g, task) {
> > +             /* check if this task has already switched over */
> > +             if (task->patch_state == klp_target_state)
> > +                     continue;
> > +
> > +             proceed_pending_processes++;
> > +
> >               if (!klp_try_switch_task(task))
> >                       complete = false;
> > +
> > +             /*
> > +              * Prevent hardlockup by not blocking tasklist_lock for too long.
> > +              * But guarantee the forward progress by making sure at least
> > +              * some pending processes were checked.
> > +              */
> > +             if (rwlock_is_contended(&tasklist_lock) &&
> > +                 time_after(jiffies, timeout) &&
> > +                 proceed_pending_processes > 100) {
> > +                     complete = false;
> > +                     break;
> > +             }
> > +     }
> >       read_unlock(&tasklist_lock);
>
> Instead of all this can we not just use rcu_read_lock() instead of
> tasklist_lock?

I’m concerned that there’s a potential race condition in fork() if we
use RCU, as illustrated below:

  CPU0                                                     CPU1

write_lock_irq(&tasklist_lock);
klp_copy_process(p);

                                         parent->patch_state=klp_target_state

list_add_tail_rcu(&p->tasks, &init_task.tasks);
write_unlock_irq(&tasklist_lock);

In this scenario, after the parent executes klp_copy_process(p) to
copy its patch_state to the child, but before adding the child to the
task list, the parent’s patch_state might be updated by the KLP
transition. This could result in the child being left with an outdated
state.

We need to ensure that klp_copy_process() and list_add_tail_rcu() are
treated as a single atomic operation.

--
Regards



Yafang





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux Kernel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux