On Tue, May 23, 2023 at 6:50 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Mon, 22 May 2023 at 20:03, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
> >
> > + linux-arm-kernel
> >
> > On Sun, May 21, 2023 at 9:05 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> > >
> > > ARM defconfig misses to detect some section mismatches.
> > >
> > > [test code]
> > >
> > >   #include <linux/init.h>
> > >
> > >   int __initdata foo;
> > >   int get_foo(int x) { return foo; }
> > >
> > > It is apparently a bad reference, but modpost does not report anything
> > > for ARM defconfig (i.e. multi_v7_defconfig).
> > >
> > > The test code above produces the following relocations.
> > >
> > > Relocation section '.rel.text' at offset 0x200 contains 2 entries:
> > >  Offset     Info    Type               Sym.Value  Sym. Name
> > > 00000000  0000062b R_ARM_MOVW_ABS_NC   00000000   .LANCHOR0
> > > 00000004  0000062c R_ARM_MOVT_ABS      00000000   .LANCHOR0
> > >
> > > Relocation section '.rel.ARM.exidx' at offset 0x210 contains 2 entries:
> > >  Offset     Info    Type               Sym.Value  Sym. Name
> > > 00000000  0000022a R_ARM_PREL31        00000000   .text
> > > 00000000  00001000 R_ARM_NONE          00000000   __aeabi_unwind_cpp_pr0
> > >
> > > Currently, R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS are just skipped.
> > >
> > > Add code to handle them. I checked arch/arm/kernel/module.c to learn
> > > how the offset is encoded in the instruction.
> > >
> > > The referenced symbol in relocation might be a local anchor.
> > > If is_valid_name() returns false, let's search for a better symbol name.
> > >
> > > Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
> > > ---
> > >
> > >  scripts/mod/modpost.c | 12 ++++++++++--
> > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> > > index 34fbbd85bfde..ed2301e951a9 100644
> > > --- a/scripts/mod/modpost.c
> > > +++ b/scripts/mod/modpost.c
> > > @@ -1108,7 +1108,7 @@ static inline int is_valid_name(struct elf_info *elf, Elf_Sym *sym)
> > >  /**
> > >   * Find symbol based on relocation record info.
> > >   * In some cases the symbol supplied is a valid symbol so
> > > - * return refsym. If st_name != 0 we assume this is a valid symbol.
> > > + * return refsym. If is_valid_name() == true, we assume this is a valid symbol.
> > >   * In other cases the symbol needs to be looked up in the symbol table
> > >   * based on section and address.
> > >   * **/
> > > @@ -1121,7 +1121,7 @@ static Elf_Sym *find_tosym(struct elf_info *elf, Elf64_Sword addr,
> > >         Elf64_Sword d;
> > >         unsigned int relsym_secindex;
> > >
> > > -       if (relsym->st_name != 0)
> > > +       if (is_valid_name(elf, relsym))
> > >                 return relsym;
> > >
> > >         /*
> > > @@ -1312,11 +1312,19 @@ static int addend_arm_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)
> > >         unsigned int r_typ = ELF_R_TYPE(r->r_info);
> > >         Elf_Sym *sym = elf->symtab_start + ELF_R_SYM(r->r_info);
> > >         unsigned int inst = TO_NATIVE(*reloc_location(elf, sechdr, r));
> > > +       int offset;
> > >
> > >         switch (r_typ) {
> > >         case R_ARM_ABS32:
> > >                 r->r_addend = inst + sym->st_value;
> > >                 break;
> > > +       case R_ARM_MOVW_ABS_NC:
> > > +       case R_ARM_MOVT_ABS:
> > > +               offset = ((inst & 0xf0000) >> 4) | (inst & 0xfff);
> > > +               offset = (offset ^ 0x8000) - 0x8000;
> >
> > The code in arch/arm/kernel/module.c then right shifts the offset by
> > 16 for R_ARM_MOVT_ABS. Is that necessary?
> >
>
> MOVW/MOVT pairs are limited to an addend of -/+ 32 KiB, and the same
> value must be encoded in both instructions.

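For reference, the two decoding statements in the hunk above can be sketched as a standalone helper. This is only an illustration under stated assumptions: the name arm_mov_imm16() is hypothetical and exists in neither modpost nor arch/arm/kernel/module.c, and it covers only the ARM-mode (A32) encoding, not Thumb2.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of modpost): extract the signed 16-bit
 * addend from an ARM-mode (A32) MOVW or MOVT instruction word.
 * The immediate is stored as imm4 (bits 19:16) and imm12 (bits 11:0).
 */
static int32_t arm_mov_imm16(uint32_t inst)
{
	/* Reassemble imm4:imm12 into a single 16-bit value. */
	int32_t offset = (int32_t)(((inst & 0xf0000) >> 4) | (inst & 0xfff));

	/*
	 * Sign-extend from 16 bits: 0x0000..0x7fff are unchanged,
	 * 0x8000..0xffff map to -32768..-1.
	 */
	return (offset ^ 0x8000) - 0x8000;
}
```

With this sketch, arm_mov_imm16(0xe3013234) (movw r3, #0x1234) yields 0x1234 and arm_mov_imm16(0xe34f3ffc) (movt r3, #0xfffc) yields -4, so the result always stays within the -/+ 32 KiB addend range.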

In my understanding, 'movt' loads the immediate value into the
upper 16 bits of the register.

I am just curious about the code in arch/arm/kernel/module.c.

Please see the 'case R_ARM_MOVT_ABS:' part.

 [1] 'offset' is the immediate value encoded in the instruction
 [2] Add sym->st_value
 [3] Right-shift 'offset' by 16
 [4] Write it back to the instruction

So, the immediate value encoded in the instruction
is divided by 65536.

I guess we need something like the following?
(left-shift by 16).

 if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS ||
     ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL)
         offset <<= 16;

>
> When constructing the actual immediate value from the symbol value and
> the addend, only the top 16 bits are used in MOVT and the bottom 16
> bits in MOVW.
>
> However, this code seems to borrow the Elf_Rela::addend field (which
> ARM does not use natively) to record the intermediate value, which
> would need to be split if it is used to fix up instruction opcodes.

At first, modpost supported only RELA for section mismatch checks.

Later, 2c1a51f39d95 ("[PATCH] kbuild: check SHT_REL sections")
added REL support.

But the common code still used Elf_Rela.

modpost does not need to write back the fixed instruction.
modpost is only interested in the offset address.

Currently, modpost saves the offset address in
r->r_offset even for Rel. I do not like this code.

So, I am trying to reduce the use of Elf_Rela.
For example, see this patch:

https://patchwork.kernel.org/project/linux-kbuild/patch/20230521160426.1881124-8-masahiroy@xxxxxxxxxx/

> Btw the Thumb2 encodings of MOVT and MOVW seem to be missing here.

Right, if CONFIG_THUMB2_KERNEL=y, the section mismatch check is
incomplete. Several relocation types are just skipped.

> >
> > > +               offset += sym->st_value;
> > > +               r->r_addend = offset;
> > > +               break;
> > >         case R_ARM_PC24:
> > >         case R_ARM_CALL:
> > >         case R_ARM_JUMP24:
> > > --
> > > 2.39.2
> > >
> >
> >
> > --
> > Thanks,
> > ~Nick Desaulniers

--
Best Regards
Masahiro Yamada