On Tue, May 23, 2023 at 9:21 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Tue, 23 May 2023 at 13:59, Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> >
> > On Tue, May 23, 2023 at 6:50 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > >
> > > On Mon, 22 May 2023 at 20:03, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
> > > >
> > > > + linux-arm-kernel
> > > >
> > > > On Sun, May 21, 2023 at 9:05 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> > > > >
> > > > > ARM defconfig misses to detect some section mismatches.
> > > > >
> > > > > [test code]
> > > > >
> > > > > #include <linux/init.h>
> > > > >
> > > > > int __initdata foo;
> > > > > int get_foo(int x) { return foo; }
> > > > >
> > > > > It is apparently a bad reference, but modpost does not report anything
> > > > > for ARM defconfig (i.e. multi_v7_defconfig).
> > > > >
> > > > > The test code above produces the following relocations.
> > > > >
> > > > > Relocation section '.rel.text' at offset 0x200 contains 2 entries:
> > > > >  Offset     Info    Type                Sym.Value  Sym. Name
> > > > > 00000000  0000062b R_ARM_MOVW_ABS_NC   00000000   .LANCHOR0
> > > > > 00000004  0000062c R_ARM_MOVT_ABS      00000000   .LANCHOR0
> > > > >
> > > > > Relocation section '.rel.ARM.exidx' at offset 0x210 contains 2 entries:
> > > > >  Offset     Info    Type                Sym.Value  Sym. Name
> > > > > 00000000  0000022a R_ARM_PREL31        00000000   .text
> > > > > 00000000  00001000 R_ARM_NONE          00000000   __aeabi_unwind_cpp_pr0
> > > > >
> > > > > Currently, R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS are just skipped.
> > > > >
> > > > > Add code to handle them. I checked arch/arm/kernel/module.c to learn
> > > > > how the offset is encoded in the instruction.
> > > > >
> > > > > The referenced symbol in relocation might be a local anchor.
> > > > > If is_valid_name() returns false, let's search for a better symbol name.
> > > > >
> > > > > Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
> > > > > ---
> > > > >
> > > > >  scripts/mod/modpost.c | 12 ++++++++++--
> > > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> > > > > index 34fbbd85bfde..ed2301e951a9 100644
> > > > > --- a/scripts/mod/modpost.c
> > > > > +++ b/scripts/mod/modpost.c
> > > > > @@ -1108,7 +1108,7 @@ static inline int is_valid_name(struct elf_info *elf, Elf_Sym *sym)
> > > > >  /**
> > > > >   * Find symbol based on relocation record info.
> > > > >   * In some cases the symbol supplied is a valid symbol so
> > > > > - * return refsym. If st_name != 0 we assume this is a valid symbol.
> > > > > + * return refsym. If is_valid_name() == true, we assume this is a valid symbol.
> > > > >   * In other cases the symbol needs to be looked up in the symbol table
> > > > >   * based on section and address.
> > > > >   * **/
> > > > > @@ -1121,7 +1121,7 @@ static Elf_Sym *find_tosym(struct elf_info *elf, Elf64_Sword addr,
> > > > >         Elf64_Sword d;
> > > > >         unsigned int relsym_secindex;
> > > > >
> > > > > -       if (relsym->st_name != 0)
> > > > > +       if (is_valid_name(elf, relsym))
> > > > >                 return relsym;
> > > > >
> > > > >         /*
> > > > > @@ -1312,11 +1312,19 @@ static int addend_arm_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)
> > > > >         unsigned int r_typ = ELF_R_TYPE(r->r_info);
> > > > >         Elf_Sym *sym = elf->symtab_start + ELF_R_SYM(r->r_info);
> > > > >         unsigned int inst = TO_NATIVE(*reloc_location(elf, sechdr, r));
> > > > > +       int offset;
> > > > >
> > > > >         switch (r_typ) {
> > > > >         case R_ARM_ABS32:
> > > > >                 r->r_addend = inst + sym->st_value;
> > > > >                 break;
> > > > > +       case R_ARM_MOVW_ABS_NC:
> > > > > +       case R_ARM_MOVT_ABS:
> > > > > +               offset = ((inst & 0xf0000) >> 4) | (inst & 0xfff);
> > > > > +               offset = (offset ^ 0x8000) - 0x8000;
> > > >
> > > > The code in arch/arm/kernel/module.c then right shifts the offset by
> > > > 16 for R_ARM_MOVT_ABS. Is that necessary?
> > > >
> > >
> > > MOVW/MOVT pairs are limited to an addend of -/+ 32 KiB, and the same
> > > value must be encoded in both instructions.
> >
> >
> > In my understanding, 'movt' loads the immediate value to
> > the upper 16-bit of the register.
> >
>
> Correct. It sets the upper 16 bits of a register without corrupting
> the lower 16 bits.
>
> > I am just curious about the code in arch/arm/kernel/module.c.
> >
> > Please see 'case R_ARM_MOVT_ABS:' part.
> >
> > [1] 'offset' is the immediate value encoded in instruction
> > [2] Add sym->st_value
> > [3] Right-shift 'offset' by 16
> > [4] Write it back to the instruction
> >
> > So, the immediate value encoded in the instruction
> > is divided by 65536.
> >
> > I guess we need something like the following?
> > (left-shift by 16).
> >
> >         if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS ||
> >             ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL)
> >                 offset <<= 16;
> >
>
> No. The addend is not encoded in the same way as the effective immediate value.
>
> The addend is limited to -/+ 32 KiB (range of s16), and the MOVT
> instruction must use the same addend value as the MOVW instruction it
> is paired with, without shifting.
>
> This is necessary because otherwise, there is no way to handle an
> addend/symbol combination that results in a carry between the lower
> and upper 16 bit words. This is a consequence of the use of REL format
> rather than RELA, where the addend is part of the relocation and not
> encoded in the instructions.

Ah, OK. Now I understand.

>
> >
> > >
> > >
> > > When constructing the actual immediate value from the symbol value and
> > > the addend, only the top 16 bits are used in MOVT and the bottom 16
> > > bits in MOVW.
> > >
> > > However, this code seems to borrow the Elf_Rela::addend field (which
> > > ARM does not use natively) to record the intermediate value, which
> > > would need to be split if it is used to fix up instruction opcodes.
> >
> > At first, modpost supported only RELA for section mismatch checks.
> >
> > Later, 2c1a51f39d95 ("[PATCH] kbuild: check SHT_REL sections")
> > added REL support.
> >
> > But, the common code still used Elf_Rela.
> >
> >
> > modpost does not need to write back the fixed instruction.
> > modpost is only interested in the offset address.
> >
> > Currently, modpost saves the offset address in
> > r->r_offset even for Rel. I do not like this code.
> >
> > So, I am trying to reduce the use of Elf_Rela.
> > For example, this patch.
> > https://patchwork.kernel.org/project/linux-kbuild/patch/20230521160426.1881124-8-masahiroy@xxxxxxxxxx/
> >
>
> Yeah, that looks better to me.
>
> >
> > > Btw the Thumb2 encodings of MOVT and MOVW seem to be missing here.
> >
> > Right, if CONFIG_THUMB2_KERNEL=y, section mismatch check.
> >
> > Several relocation types are just skipped.
> >
>
> Skipped entirely? Or only for the diagnostic print that outputs the symbol name?

Skipped entirely. modpost cannot detect section mismatches if you enable
CONFIG_THUMB2_KERNEL.

-- 
Best Regards
Masahiro Yamada
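
For anyone reading this thread later, the addend discussion above can be
sketched in standalone C. This is only an illustration under stated
assumptions, not code from modpost or arch/arm/kernel/module.c: the helper
names decode_movw_movt_addend(), movw_imm16() and movt_imm16() are
hypothetical, and only the A32 (ARM mode) encoding is covered, not Thumb2.
The first helper mirrors the two lines added by the patch; the other two
show how the final immediates come out of symbol value plus addend.

```c
#include <stdint.h>

/*
 * Hypothetical helper: extract the REL addend stored in an A32 MOVW/MOVT
 * instruction. The 16-bit immediate is split into imm4 (bits 19:16) and
 * imm12 (bits 11:0). It is sign-extended from 16 bits, so the addend is
 * limited to the s16 range. MOVW and MOVT carry the same unshifted value.
 */
static inline int32_t decode_movw_movt_addend(uint32_t inst)
{
        int32_t offset = (int32_t)(((inst & 0xf0000) >> 4) | (inst & 0xfff));

        return (offset ^ 0x8000) - 0x8000;
}

/* Hypothetical helper: the immediate MOVW ends up with (low 16 bits). */
static inline uint16_t movw_imm16(uint32_t sym_value, int32_t addend)
{
        return (uint16_t)((sym_value + (uint32_t)addend) & 0xffff);
}

/*
 * Hypothetical helper: the immediate MOVT ends up with (high 16 bits).
 * The shift happens after the addition, so a carry out of the low half
 * is not lost.
 */
static inline uint16_t movt_imm16(uint32_t sym_value, int32_t addend)
{
        return (uint16_t)((sym_value + (uint32_t)addend) >> 16);
}
```

With a symbol value of 0x0000fffc and an addend of 8, the sum is
0x00010004, so movw_imm16() yields 0x0004 and movt_imm16() yields 0x0001.
That carry into the upper half is only visible because MOVT sees the full
addend; if the MOVT instruction stored the addend pre-shifted, the upper
half would stay 0x0000.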