On Tue, Jan 3, 2023 at 2:39 PM Joe Lawrence <joe.lawrence@xxxxxxxxxx> wrote: > [...] > >> > >> Signed-off-by: Miroslav Benes <mbenes@xxxxxxx> > >> Signed-off-by: Song Liu <song@xxxxxxxxxx> > >> Reported-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx> > >> Tested-by: Joe Lawrence <joe.lawrence@xxxxxxxxxx> # x86_64, s390x, ppc64le > > Since there is still some work required for ppc64le and possibly s390x, > let's strip the tested-by tag. Each version should be re-tested and > then we can let the maintainer add it on the final version. Sure. Will remove the tag in later version(s). [...] > >> +#ifdef CONFIG_LIVEPATCH > >> +void clear_relocate_add(Elf64_Shdr *sechdrs, > >> + const char *strtab, > >> + unsigned int symindex, > >> + unsigned int relsec, > >> + struct module *me) > >> +{ > >> + unsigned int i; > >> + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr; > >> + Elf64_Sym *sym; > >> + unsigned long *location; > >> + const char *symname; > >> + u32 *instruction; > >> + > >> + pr_debug("Clearing ADD relocate section %u to %u\n", relsec, > >> + sechdrs[relsec].sh_info); > >> + > >> + for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) { > >> + location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr > >> + + rela[i].r_offset; > >> + sym = (Elf64_Sym *)sechdrs[symindex].sh_addr > >> + + ELF64_R_SYM(rela[i].r_info); > >> + symname = me->core_kallsyms.strtab > >> + + sym->st_name; > >> + > >> + if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24) > >> + continue; > > Ppc64le will need to handle additional relocation types. > > While debugging a related issue on ppc64le regarding > CONFIG_STRICT_MODULE_RWX [3], these were the extent of the > klp-relocation types generated by kpatch-build and klp-convert-tree: > > - R_PPC64_REL24 to symbols in other .text sections > - R_PPC64_ADDR64 to symbols thru .TOC > - R_PPC64_REL64 to static key symbols > > I believe R_PPC64_ADDR64 and R_PPC64_REL64 can be simply be reset to 0. > > [3] https://github.com/linuxppc/issues/issues/375#issuecomment-1233698835 Hmm.. I don't quite follow why the two are related.. What's the failure like if we don't handle R_PPC64_ADDR64 and R_PPC64_REL64? [...] > >> #else /*X86_64*/ > >> -static int __apply_relocate_add(Elf64_Shdr *sechdrs, > >> +static int __write_relocate_add(Elf64_Shdr *sechdrs, > >> const char *strtab, > >> unsigned int symindex, > >> unsigned int relsec, > >> struct module *me, > >> - void *(*write)(void *dest, const void *src, size_t len)) > >> + void *(*write)(void *dest, const void *src, size_t len), > >> + bool apply) > > Aside: I do prefer the style of one function to handle applying/clearing > of relocations. x86_64 isn't too bad, but other arches have a much > richer set of relocations that do all sorts of relative/offset/TOC/etc > gymnastics that keeping their code in one spot should be much more > maintainable. I think this pattern will be pretty hairy for ppc64le? > > >> { > >> unsigned int i; > >> Elf64_Rela *rel = (void *)sechdrs[relsec].sh_addr; > >> Elf64_Sym *sym; > >> void *loc; > >> u64 val; > >> + u64 zero = 0ULL; > >> > >> DEBUGP("Applying relocate section %u to %u\n", > >> relsec, sechdrs[relsec].sh_info); > > How about keying off the apply bool to display "Applying" vs "Clearing". Great catch! > > >> @@ -163,40 +165,60 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs, > >> case R_X86_64_NONE: > >> break; > >> case R_X86_64_64: > >> - if (*(u64 *)loc != 0) > >> - goto invalid_relocation; > >> - write(loc, &val, 8); > >> + if (apply) { > >> + if (*(u64 *)loc != 0) > >> + goto invalid_relocation; > >> + write(loc, &val, 8); > >> + } else { > >> + write(loc, &zero, 8); > >> + } > >> break; > >> case R_X86_64_32: > >> - if (*(u32 *)loc != 0) > >> - goto invalid_relocation; > >> - write(loc, &val, 4); > >> - if (val != *(u32 *)loc) > >> - goto overflow; > >> + if (apply) { > >> + if (*(u32 *)loc != 0) > >> + goto invalid_relocation; > >> + write(loc, &val, 4); > >> + if (val != *(u32 *)loc) > >> + goto overflow; > >> + } else { > >> + write(loc, &zero, 4); > >> + } > >> break; > >> case R_X86_64_32S: > >> - if (*(s32 *)loc != 0) > >> - goto invalid_relocation; > >> - write(loc, &val, 4); > >> - if ((s64)val != *(s32 *)loc) > >> - goto overflow; > >> + if (apply) { > >> + if (*(s32 *)loc != 0) > >> + goto invalid_relocation; > >> + write(loc, &val, 4); > >> + if ((s64)val != *(s32 *)loc) > >> + goto overflow; > >> + } else { > >> + write(loc, &zero, 4); > >> + } > >> break; > >> case R_X86_64_PC32: > >> case R_X86_64_PLT32: > >> - if (*(u32 *)loc != 0) > >> - goto invalid_relocation; > >> - val -= (u64)loc; > >> - write(loc, &val, 4); > >> + if (apply) { > >> + if (*(u32 *)loc != 0) > >> + goto invalid_relocation; > >> + val -= (u64)loc; > >> + write(loc, &val, 4); > >> #if 0 > >> - if ((s64)val != *(s32 *)loc) > >> - goto overflow; > >> + if ((s64)val != *(s32 *)loc) > >> + goto overflow; > >> #endif > > Btw, This has been #if 0'd for so long I wonder if we should just remove it? > > >> + } else { > >> + write(loc, &zero, 4); > >> + } > >> break; > >> case R_X86_64_PC64: > >> - if (*(u64 *)loc != 0) > >> - goto invalid_relocation; > >> - val -= (u64)loc; > >> - write(loc, &val, 8); > >> + if (apply) { > >> + if (*(u64 *)loc != 0) > >> + goto invalid_relocation; > >> + val -= (u64)loc; > >> + write(loc, &val, 8); > >> + } else { > >> + write(loc, &zero, 8); > >> + } > > In my branch [2] ("(song, x86_64 suggestions) livepatch: Clear > relocation targets on a module removal"), I experimented with reducing > these cases down into two steps: compute the new val and then set the > location. > > Having back-to-back relocation case blocks wasn't ideal, but it does > reduce code a bit: > > For step 1: > - combine the relative relocation assignment > - consider !apply to be val of 0, drop the zero variable > > For step 2: > - drop the if (apply) conditional, just write the new val > > For at least this arch, I think it came out OK. Summarized here for > reference: > > /* Calculate value (or zero if clearing) */ > if (apply) { > val = sym->st_value + rel[i].r_addend; > > switch (ELF64_R_TYPE(rel[i].r_info)) { > case R_X86_64_PC32: > case R_X86_64_PLT32: > case R_X86_64_PC64: > val -= (u64)loc; > break; > } > } else { > val = 0ULL; > } > > /* Apply/clear relocation value */ > switch (ELF64_R_TYPE(rel[i].r_info)) { > case R_X86_64_NONE: > break; > case R_X86_64_64: > if (apply && *(u64 *)loc != 0) > goto invalid_relocation; > write(loc, &val, 8); > break; > case R_X86_64_32: > if (apply && *(u32 *)loc != 0) > goto invalid_relocation; > write(loc, &val, 4); > if (val != *(u32 *)loc) > goto overflow; > break; > [ ... etc ... ] This version looks good to me. > > > That said, things got hairy really fast when I tried applying a similar > pattern to ppc64le code, so maybe this model wouldn't help other arches > so much. (I haven't looked at s390x yet.) ppc64le seems to have totally different pattern. > > I'm not married to this approach, but thought I'd mention it as it > helped me tease apart these long apply_relocation() functions. > > >> break; > >> default: > >> pr_err("%s: Unknown rela relocation: %llu\n", > >> @@ -219,11 +241,12 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs, > >> return -ENOEXEC; > >> } [...] > >> > > > > I will try to get around to testing s390x sometime soon and report back > on how that works out. > Thanks! Song