Re: [RFC PATCH 2/2] livepatch: Implement livepatch hybrid mode

Yafang Shao <laoar.shao@xxxxxxxxx> · Sat, 8 Feb 2025 10:14:55 +0800

On Fri, Feb 7, 2025 at 5:36 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> On Fri 2025-02-07 11:16:45, Yafang Shao wrote:
> > On Fri, Feb 7, 2025 at 10:31 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > > On Mon, Jan 27, 2025 at 02:35:26PM +0800, Yafang Shao wrote:
> > > > - Temporary Loss of Patching
> > > >
> > > >   During the replacement process, the old patch is set to a NOP (no-operation)
> > > >   before the new patch is fully applied. This creates a window where the
> > > >   function temporarily reverts to its original, unpatched state. If the old
> > > >   patch fixed a critical issue (e.g., one that prevented a system panic), the
> > > >   system could become vulnerable to that issue during the transition.
> > >
> > > Are you saying that atomic replace is not atomic?  If so, this sounds
> > > like another bug.
> >
> > >From my understanding, there’s a window where the original function is
> > not patched.
>
> This is a misunderstanding.
>
> > klp_enable_patch
> > + klp_init_patch
> >    + if (patch->replace)
> >           klp_add_nops(patch);  <<<< set all old patches to nop
>
> 1. The "nop" entry is added into the _new_ (to-be-enabled) livepatch,
>    see klp_add_nops(patch). The parameter is the _newly_ enabled patch.
>
> 2. The "nop" entries are added only for functions which are currently
>    livepatched but they are not longer livepatched in the new
>    livepatch, see:
>
> static int klp_add_object_nops(struct klp_patch *patch,
>                                struct klp_object *old_obj)
> {
> [...]
>         klp_for_each_func(old_obj, old_func) {
>                 func = klp_find_func(obj, old_func);
>                 if (func)
>                         continue;       <------ Do not allocate nop
>                                                 when the fuction is
>                                                 implemeted in the new
>                                                 livepatch.
>
>                 func = klp_alloc_func_nop(old_func, obj);
>                 if (!func)
>                         return -ENOMEM;
>         }
>
>         return 0;
> }

Thanks for your explanation.

>
>
> > + __klp_enable_patch
> >    + klp_patch_object
> >       + klp_patch_func
> >          + ops = klp_find_ops(func->old_func);
> >             + if (ops)
> >                    // add the new patch to the func_stack list
> >                    list_add_rcu(&func->stack_node, &ops->func_stack);
> >
> >
> > klp_ftrace_handler
> > + func = list_first_or_null_rcu(&ops->func_stack, struct klp_func
>
> 3. You omitted this important part of the code:
>
>         if (unlikely(func->transition)) {
>                 patch_state = current->patch_state;
>                 if (patch_state == KLP_TRANSITION_UNPATCHED) {
>                         /*
> ---->                    * Use the previously patched version of the function.
> ---->                    * If no previous patches exist, continue with the
> ---->                    * original function.
>                          */
>                         func = list_entry_rcu(func->stack_node.next,
>                                               struct klp_func, stack_node);
>
>
>         The condition "patch_state == KLP_TRANSITION_UNPATCHED" might
>         be a bit misleading.
>
>         The state "KLP_TRANSITION_UNPATCHED" means that it can't use
>         the code from the "new" livepatch => it has to fallback
>         to the previously used code => previous livepatch.

Understood.

>
>
> > + if (func->nop)
> >        goto unlock;
> > + ftrace_regs_set_instruction_pointer(fregs, (unsigned long)func->new_func);
>
> > Before the new atomic replace patch is added to the func_stack list,
> > the old patch is already set to nop.
>       ^^^
>
>      The nops are set in the _new_ patch for functions which will
>      not longer get livepatched, see the commit e1452b607c48c642
>      ("livepatch: Add atomic replace") for more details.
>
> > If klp_ftrace_handler() is
> > triggered at this point, it will effectively do nothing—in other
> > words, it will execute the original function.
> > I might be wrong.
>
> Fortunately, you are wrong. This would be a serious violation of
> the consistency model and livepatches modifying some semantic would
> blow up systems.

That is great. Thanks for your help.

-- 
Regards
Yafang