Re: [PATCH v7] livepatch: Clear relocation targets on a module removal

On Wed, Jan 4, 2023 at 2:26 AM Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> On Wed 2022-12-14 09:40:35, Song Liu wrote:
> > From: Miroslav Benes <mbenes@xxxxxxx>
> >
> > Josh reported a bug:
> >
> >   When the object to be patched is a module, and that module is
> >   rmmod'ed and reloaded, it fails to load with:
> >
> >   module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc 00000000ba0302e9, val ffffffffa03e293c
> >   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
> >   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> >
> >   The livepatch module has a relocation which references a symbol
> >   in the _previous_ loading of nfsd. When apply_relocate_add()
> >   tries to replace the old relocation with a new one, it sees that
> >   the previous one is nonzero and it errors out.
> >
> > We thus decided to reverse the relocation patching (clear all relocation
> > targets on x86_64). The solution is not
> > universal and is too much arch-specific, but it may prove to be simpler
> > in the end.
> >
> > --- a/arch/powerpc/kernel/module_64.c
> > +++ b/arch/powerpc/kernel/module_64.c
> > @@ -739,6 +739,67 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> >       return 0;
> >  }
> >
> > +#ifdef CONFIG_LIVEPATCH
> > +void clear_relocate_add(Elf64_Shdr *sechdrs,
> > +                    const char *strtab,
> > +                    unsigned int symindex,
> > +                    unsigned int relsec,
> > +                    struct module *me)
> > +{
> > +     unsigned int i;
> > +     Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> > +     Elf64_Sym *sym;
> > +     unsigned long *location;
> > +     const char *symname;
> > +     u32 *instruction;
> > +
> > +     pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> > +              sechdrs[relsec].sh_info);
> > +
> > +     for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> > +             location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> > +                     + rela[i].r_offset;
> > +             sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> > +                     + ELF64_R_SYM(rela[i].r_info);
> > +             symname = me->core_kallsyms.strtab
> > +                     + sym->st_name;
> > +
> > +             if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> > +                     continue;
>
> Is it OK to continue?
>
> IMHO, we should at least warn here. It means that the special elf
> section contains a relocation that we are not able to clear. It will
> most likely blow up when we try to load the livepatched module
> again.
>
> > +             /*
> > +              * reverse the operations in apply_relocate_add() for case
> > +              * R_PPC_REL24.
> > +              */
> > +             if (sym->st_shndx != SHN_UNDEF &&
> > +                 sym->st_shndx != SHN_LIVEPATCH)
> > +                     continue;
>
> Same here. IMHO, we should warn when the section contains something
> that we are not able to clear.
>
> > +             /* skip mprofile and ftrace calls, same as restore_r2() */
> > +             if (is_mprofile_ftrace_call(symname))
> > +                     continue;
>
> Is this correct? restore_r2() returns "1" in this case. As a result
> apply_relocate_add() returns immediately with -ENOEXEC. IMHO, we
> should print a warning and return as well.
>
> > +             instruction = (u32 *)location;
> > +             /* skip sibling call, same as restore_r2() */
> > +             if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> > +                     continue;
>
> Same here. restore_r2() returns '1' in this case...
>
> > +
> > +             instruction += 1;
> > +             /*
> > +              * Patch location + 1 back to NOP so the next
> > +              * apply_relocate_add() call (reload the module) will not
> > +              * fail the sanity check in restore_r2():
> > +              *
> > +              *         if (*instruction != PPC_RAW_NOP()) {
> > +              *             pr_err(...);
> > +              *             return 0;
> > +              *         }
> > +              */
> > +             patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> > +     }
>
> This seems incomplete. The above code reverts patch_instruction() called
> from restore_r2(). But there is another patch_instruction() called in
> apply_relocate_add() for case R_PPC_REL24. IMHO, we should revert this
> as well.
>
> > +}
> > +#endif
>
> IMHO, this approach is really bad. The function is not maintainable.
> It will be very hard to keep it in sync with apply_relocate_add().
> And all the mistakes are just a proof.

I don't really think the above are mistakes. This should be the same
as the version that passed Joe's tests. (I didn't test it myself).

>
> IMHO, the only sane way is to avoid the code duplication.

I think this comes back to the question of whether we want
clear_relocate_add() to
   1) undo everything done by apply_relocate_add();
or
   2) make sure the next apply_relocate_add() succeeds.
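
As a rough illustration of what 2) means for the restore_r2() case, here
is a minimal userspace sketch. The sketch_* names and the hard-coded
opcode values are assumptions made for this example (the kernel uses
PPC_RAW_NOP() and its own load-TOC constant); this is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings, for illustration only. */
#define SKETCH_NOP    0x60000000u /* ori r0,r0,0 */
#define SKETCH_LD_TOC 0xe8410018u /* ld r2,24(r1), assumed ELFv2 TOC save slot */

/*
 * Mimics the sanity check quoted above: the slot after the branch is
 * only patched if it still holds a NOP. Returns 1 on success, 0 if the
 * check fails (apply_relocate_add() would then return -ENOEXEC).
 */
static int sketch_restore_r2(uint32_t *slot)
{
	assert(slot != NULL);
	if (*slot != SKETCH_NOP)
		return 0;
	*slot = SKETCH_LD_TOC;
	return 1;
}

/* Approach 2): put back only what the next apply will check. */
static void sketch_clear_r2(uint32_t *slot)
{
	assert(slot != NULL);
	*slot = SKETCH_NOP;
}
```

Applying twice without a clear in between fails the check; applying,
clearing, and applying again succeeds, which is all that 2) promises.
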

The current version does 2). If we want to share a lot of code
between apply_ and clear_, we need to go with 1). Do we
want something like:

                /* `Everything is relative'. */
                value = sym->st_value + rela[i].r_addend;
                if (!apply)
                        value = 0;

                switch (ELF64_R_TYPE(rela[i].r_info)) {
                case R_PPC64_ADDR32:
                        /* Simply set it */
                        *(u32 *)location = value;
                        break;

                case R_PPC64_ADDR64:
                        /* Simply set it */
                        *(unsigned long *)location = value;
                        break;

                case R_PPC64_TOC:
                        value = apply ? my_r2(sechdrs, me) : 0;
                        *(unsigned long *)location = value;
                        break;
... (a lot more).

Actually, since R_PPC64_ADDR32 etc. don't cause
the next apply_ to fail, we can make clear_ do the same
thing as apply_ for them (write the same value again).

Neither alternative looks better to me, but I am OK
with any of them. Please just let me know which one is
preferred:

a. the current version;
b. clear_ undoes everything apply_ does (the sample code
   above);
c. clear_ undoes R_PPC_REL24, but _redoes_ what apply_
   does for the other ELF64_R_TYPEs (this should be
   clearer code than option b).
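
To compare options b and c for the simple "just set it" relocation
types, here is a minimal userspace sketch. The enum and function names
are hypothetical and only two relocation types are modelled; the real
apply_relocate_add() handles many more cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for R_PPC64_ADDR32 / R_PPC64_ADDR64. */
enum sketch_rtype { SKETCH_ADDR32, SKETCH_ADDR64 };

enum sketch_mode {
	SKETCH_APPLY,      /* normal relocation */
	SKETCH_CLEAR_UNDO, /* option b: write 0 back */
	SKETCH_CLEAR_REDO, /* option c: write the same value again */
};

/* Returns 0 on success, -1 for a type the sketch does not model. */
static int sketch_relocate_one(enum sketch_rtype type, void *location,
			       uint64_t value, enum sketch_mode mode)
{
	assert(location != NULL);

	if (mode == SKETCH_CLEAR_UNDO)
		value = 0;

	switch (type) {
	case SKETCH_ADDR32:
		*(uint32_t *)location = (uint32_t)value;
		return 0;
	case SKETCH_ADDR64:
		*(uint64_t *)location = value;
		return 0;
	}
	return -1;
}
```

In this sketch option c needs no extra branch for these types, while
option b needs the value override before the switch.
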

btw: undoing the following logic for R_PPC_REL24 alone is
not really easy (for me):

                case R_PPC_REL24:
                        /* FIXME: Handle weak symbols here --RR */
                        if (sym->st_shndx == SHN_UNDEF ||
                            sym->st_shndx == SHN_LIVEPATCH) {
                                /* External: go via stub */
                                value = stub_for_addr(sechdrs, value, me,
                                                strtab + sym->st_name);
                                if (!value)
                                        return -ENOENT;
                                if (!restore_r2(strtab + sym->st_name,
                                                (u32 *)location + 1, me))
                                        return -ENOEXEC;
                        } else
                                value += local_entry_offset(sym);

                        /* Convert value to relative */
                        value -= (unsigned long)location;
                        if (value + 0x2000000 > 0x3ffffff || (value & 3) != 0){
                                pr_err("%s: REL24 %li out of range!\n",
                                       me->name, (long int)value);
                                return -ENOEXEC;
                        }

                        /* Only replace bits 2 through 26 */
                        value = (*(uint32_t *)location & ~PPC_LI_MASK) |
                                PPC_LI(value);

                        if (patch_instruction((u32 *)location, ppc_inst(value)))
                                return -EFAULT;

                        break;
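
For reference, the final "only replace the LI bits" step can be modelled
in userspace as below. This is a sketch under assumptions: the sketch_*
names are made up, the mask value is assumed to match PPC_LI_MASK, and
zeroing the LI field is just one possible definition of "cleared". The
stub and restore_r2() parts of the case are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match PPC_LI_MASK: the 24-bit LI field of an I-form branch. */
#define SKETCH_LI_MASK 0x03fffffcu

/*
 * Encode a relative branch offset into *insn, keeping the opcode and
 * the AA/LK bits. Returns 0 on success, -1 if the offset is out of
 * range or not word aligned (same conditions as the range check above).
 */
static int sketch_encode_rel24(uint32_t *insn, int64_t offset)
{
	assert(insn != NULL);
	if (offset < -0x2000000 || offset > 0x1ffffff || (offset & 3) != 0)
		return -1;
	*insn = (*insn & ~SKETCH_LI_MASK) |
		((uint32_t)offset & SKETCH_LI_MASK);
	return 0;
}

/* One possible "undo": zero the LI field, leaving opcode and AA/LK. */
static void sketch_clear_rel24(uint32_t *insn)
{
	assert(insn != NULL);
	*insn &= ~SKETCH_LI_MASK;
}
```

Note that a zeroed LI field is a branch to itself, so the cleared
instruction is only a placeholder until the next apply.
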

Thanks,
Song


