Re: [PATCH v7] livepatch: Clear relocation targets on a module removal

Song Liu <song@xxxxxxxxxx> · Wed, 4 Jan 2023 21:59:13 -0800

On Wed, Jan 4, 2023 at 3:12 PM Joe Lawrence <joe.lawrence@xxxxxxxxxx> wrote:
>
> On Wed, Jan 04, 2023 at 09:34:25AM -0800, Song Liu wrote:
> > On Wed, Jan 4, 2023 at 2:26 AM Petr Mladek <pmladek@xxxxxxxx> wrote:
> > >
> > > On Wed 2022-12-14 09:40:35, Song Liu wrote:
> > > > From: Miroslav Benes <mbenes@xxxxxxx>
> > > >
> > > > Josh reported a bug:
> > > >
> > > >   When the object to be patched is a module, and that module is
> > > >   rmmod'ed and reloaded, it fails to load with:
> > > >
> > > >   module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> > > >   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> > > >   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> > > >
> > > >   The livepatch module has a relocation which references a symbol
> > > >   in the _previous_ loading of nfsd. When apply_relocate_add()
> > > >   tries to replace the old relocation with a new one, it sees that
> > > >   the previous one is nonzero and it errors out.
> > > >
> > > > We thus decided to reverse the relocation patching (clear all relocation
> > > > targets on x86_64). The solution is not universal and is too
> > > > arch-specific, but it may prove to be simpler in the end.
> > > >
> > > > --- a/arch/powerpc/kernel/module_64.c
> > > > +++ b/arch/powerpc/kernel/module_64.c
> > > > @@ -739,6 +739,67 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> > > >       return 0;
> > > >  }
> > > >
> > > > +#ifdef CONFIG_LIVEPATCH
> > > > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > > > +                    const char *strtab,
> > > > +                    unsigned int symindex,
> > > > +                    unsigned int relsec,
> > > > +                    struct module *me)
> > > > +{
> > > > +     unsigned int i;
> > > > +     Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > > > +     Elf64_Sym *sym;
> > > > +     unsigned long *location;
> > > > +     const char *symname;
> > > > +     u32 *instruction;
> > > > +
> > > > +     pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> > > > +              sechdrs[relsec].sh_info);
> > > > +
> > > > +     for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > > > +             location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > > > +                     + rela[i].r_offset;
> > > > +             sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > > > +                     + ELF64_R_SYM(rela[i].r_info);
> > > > +             symname = me->core_kallsyms.strtab
> > > > +                     + sym->st_name;
> > > > +
> > > > +             if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> > > > +                     continue;
> > >
> > > Is it OK to continue?
> > >
> > > IMHO, we should at least warn here. It means that the special elf
> > > section contains a relocation that we are not able to clear. It will
> > > most likely blow up when we try to load the livepatched module
> > > again.
> > >
> > > > +             /*
> > > > +              * reverse the operations in apply_relocate_add() for case
> > > > +              * R_PPC_REL24.
> > > > +              */
> > > > +             if (sym->st_shndx != SHN_UNDEF &&
> > > > +                 sym->st_shndx != SHN_LIVEPATCH)
> > > > +                     continue;
> > >
> > > Same here. IMHO, we should warn when the section contains something
> > > that we are not able to clear.
> > >
> > > > +             /* skip mprofile and ftrace calls, same as restore_r2() */
> > > > +             if (is_mprofile_ftrace_call(symname))
> > > > +                     continue;
> > >
> > > Is this correct? restore_r2() returns "1" in this case. As a result
> > > apply_relocate_add() returns immediately with -ENOEXEC. IMHO, we
> > > should print a warning and return as well.
> > >
> > > > +             instruction = (u32 *)location;
> > > > +             /* skip sibling call, same as restore_r2() */
> > > > +             if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > > > +                     continue;
> > >
> > > Same here. restore_r2() returns '1' in this case...
> > >
> > > > +
> > > > +             instruction += 1;
> > > > +             /*
> > > > +              * Patch location + 1 back to NOP so the next
> > > > +              * apply_relocate_add() call (reload the module) will not
> > > > +              * fail the sanity check in restore_r2():
> > > > +              *
> > > > +              *         if (*instruction != PPC_RAW_NOP()) {
> > > > +              *             pr_err(...);
> > > > +              *             return 0;
> > > > +              *         }
> > > > +              */
> > > > +             patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> > > > +     }
> > >
> > > This seems incomplete. The above code reverts patch_instruction() called
> > > from restore_r2(). But there is another patch_instruction() called in
> > > apply_relocate_add() for case R_PPC_REL24. IMHO, we should revert this
> > > as well.
> > >
> > > > +}
> > > > +#endif
> > >
> > > IMHO, this approach is really bad. The function is not maintainable.
> > > It will be very hard to keep it in sync with apply_relocate_add().
> > > And all the mistakes above are proof of that.
> >
> > I don't really think the above are mistakes. This should be the same
> > as the version that passed Joe's tests. (I didn't test it myself).
> >
> > >
> > > IMHO, the only sane way is to avoid the code duplication.
> >
> > I think this comes back to the question of whether we want
> > clear_relocate_add() to
> >    1) undo everything by apply_relocate_add();
> > or
> >    2) make sure the next apply_relocate_add() succeeds.
> >
>
> This is a really good question and I think relates to your follow up
> question to my earlier reply, "What's the failure like if we don't
> handle R_PPC64_ADDR64 and R_PPC64_REL64?"
>
> If the code only needs to accomplish (2), then the incoming patch simply
> overwrites old relocation values.  If we prefer (1), then it needs to do
> the full reversal on unload.
>
> Stepping back, this feature is definitely foot-gun capable.
> Kpatch-build would expect that klp-relocations would only ever be needed
> in code that will patch the very same module that provides the
> relocation destination -- that is, it was never intended to reference
> through one of these klp-relocations unless it resolved to a live
> module.
>
> On the other hand, when writing the selftests, verifying against NULL
> [1] provided 1) a quick sanity check that something was "cleared" and 2)
> protected the machine against said foot-gun.
>
> [1] https://github.com/joe-lawrence/klp-convert-tree/commit/643acbb8f4c0240030b45b64a542d126370d3e6c

I don't quite follow the foot-gun here. What's the failure mode?

[...]

> > These approaches don't look better to me. But I am ok
> > with any of them. Please just let me know which one is
> > most preferable:
> >
> > a. current version;
> > b. clear_ undoes everything apply_ did (the sample code
> >    above);
> > c. clear_ undoes R_PPC_REL24, but _redoes_ everything
> >    apply_ does for the other ELF64_R_TYPEs (should be
> >    clearer code than option b).
> >
>
> This was my attempt at combining and slightly refactoring the power64
> version.  There is so much going on here I was tempted to split it off
> into separate value-assignment and write functions.  Some changes I
> liked, but I wasn't all too happy with the result.  Also, as you
> mention, completely undoing R_PPC_REL24 is less than trivial... for this
> arch, there are basically three major tasks:
>
>   1) calculate the new value, including range checking
>   2) special constructs created by restore_r2 / create_stub
>   3) writing out the value
>
> and many cases are similar, but subtly different enough to avoid easy
> code consolidation.

Thanks for exploring this direction. I guess this part won't be perfect
anyway.
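
To illustrate the shared-helper idea with the details stripped away, here
is a minimal user-space sketch, not kernel code. It assumes a single
apply/clear loop driven by a bool, as in the write_relocate_add()
prototype quoted below, and models only the x86-style "existing value is
nonzero" check from Josh's report. The toy_* names and struct toy_rela are
made up for illustration; none of the ppc64 stub or restore_r2() handling
is represented.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one relocation entry. */
struct toy_rela {
	size_t offset;		/* index into the target buffer */
	uint64_t value;		/* resolved symbol address plus addend */
};

/*
 * Apply or clear every relocation with the same loop, so both
 * directions always visit exactly the same locations.
 */
static int toy_write_relocate_add(uint64_t *target, size_t target_len,
				  const struct toy_rela *rela, size_t n,
				  bool apply)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t *loc;

		if (rela[i].offset >= target_len)
			return -ENOEXEC;
		loc = &target[rela[i].offset];

		if (!apply) {
			/* Clearing: reset the target so a later apply passes. */
			*loc = 0;
			continue;
		}

		/* Assumed to mirror the "existing value is nonzero" check. */
		if (*loc != 0)
			return -ENOEXEC;
		*loc = rela[i].value;
	}
	return 0;
}

static int toy_apply_relocate_add(uint64_t *target, size_t target_len,
				  const struct toy_rela *rela, size_t n)
{
	return toy_write_relocate_add(target, target_len, rela, n, true);
}

static int toy_clear_relocate_add(uint64_t *target, size_t target_len,
				  const struct toy_rela *rela, size_t n)
{
	return toy_write_relocate_add(target, target_len, rela, n, false);
}
```

With this shape, a second toy_apply_relocate_add() on the same buffer
returns -ENOEXEC (the -8 in the log above), and it succeeds again once
toy_clear_relocate_add() has run in between.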

PS: While we discuss a solution for ppc64, how about we ship the
fix for other archs first? I think there are only a few small things to
be addressed.

Song

>
> static int write_relocate_add(Elf64_Shdr *sechdrs,
>                    const char *strtab,
>                    unsigned int symindex,
>                    unsigned int relsec,
>                    struct module *me,
>                    bool apply)
> {
>         unsigned int i;
>         Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
>         Elf64_Sym *sym;
>         unsigned long *location;
>         unsigned long value;
>

[...]