Re: [PATCH v6 03/20] modpost: detect section mismatch for R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS

On Tue, May 23, 2023 at 9:21 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Tue, 23 May 2023 at 13:59, Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> >
> > On Tue, May 23, 2023 at 6:50 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > >
> > > On Mon, 22 May 2023 at 20:03, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
> > > >
> > > > + linux-arm-kernel
> > > >
> > > > On Sun, May 21, 2023 at 9:05 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> > > > >
> > > > > > modpost fails to detect some section mismatches for ARM defconfig.
> > > > >
> > > > >   [test code]
> > > > >
> > > > >     #include <linux/init.h>
> > > > >
> > > > >     int __initdata foo;
> > > > >     int get_foo(int x) { return foo; }
> > > > >
> > > > > > This is clearly a bad reference, but modpost does not report anything
> > > > > > for ARM defconfig (i.e. multi_v7_defconfig).
> > > > >
> > > > > The test code above produces the following relocations.
> > > > >
> > > > >   Relocation section '.rel.text' at offset 0x200 contains 2 entries:
> > > > >    Offset     Info    Type            Sym.Value  Sym. Name
> > > > >   00000000  0000062b R_ARM_MOVW_ABS_NC 00000000   .LANCHOR0
> > > > >   00000004  0000062c R_ARM_MOVT_ABS    00000000   .LANCHOR0
> > > > >
> > > > >   Relocation section '.rel.ARM.exidx' at offset 0x210 contains 2 entries:
> > > > >    Offset     Info    Type            Sym.Value  Sym. Name
> > > > >   00000000  0000022a R_ARM_PREL31      00000000   .text
> > > > >   00000000  00001000 R_ARM_NONE        00000000   __aeabi_unwind_cpp_pr0
> > > > >
> > > > > Currently, R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS are just skipped.
> > > > >
> > > > > Add code to handle them. I checked arch/arm/kernel/module.c to learn
> > > > > how the offset is encoded in the instruction.
> > > > >
> > > > > The referenced symbol in relocation might be a local anchor.
> > > > > If is_valid_name() returns false, let's search for a better symbol name.
> > > > >
> > > > > Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
> > > > > ---
> > > > >
> > > > >  scripts/mod/modpost.c | 12 ++++++++++--
> > > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> > > > > index 34fbbd85bfde..ed2301e951a9 100644
> > > > > --- a/scripts/mod/modpost.c
> > > > > +++ b/scripts/mod/modpost.c
> > > > > @@ -1108,7 +1108,7 @@ static inline int is_valid_name(struct elf_info *elf, Elf_Sym *sym)
> > > > >  /**
> > > > >   * Find symbol based on relocation record info.
> > > > >   * In some cases the symbol supplied is a valid symbol so
> > > > > - * return refsym. If st_name != 0 we assume this is a valid symbol.
> > > > > + * return refsym. If is_valid_name() == true, we assume this is a valid symbol.
> > > > >   * In other cases the symbol needs to be looked up in the symbol table
> > > > >   * based on section and address.
> > > > >   *  **/
> > > > > @@ -1121,7 +1121,7 @@ static Elf_Sym *find_tosym(struct elf_info *elf, Elf64_Sword addr,
> > > > >         Elf64_Sword d;
> > > > >         unsigned int relsym_secindex;
> > > > >
> > > > > -       if (relsym->st_name != 0)
> > > > > +       if (is_valid_name(elf, relsym))
> > > > >                 return relsym;
> > > > >
> > > > >         /*
> > > > > @@ -1312,11 +1312,19 @@ static int addend_arm_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)
> > > > >         unsigned int r_typ = ELF_R_TYPE(r->r_info);
> > > > >         Elf_Sym *sym = elf->symtab_start + ELF_R_SYM(r->r_info);
> > > > >         unsigned int inst = TO_NATIVE(*reloc_location(elf, sechdr, r));
> > > > > +       int offset;
> > > > >
> > > > >         switch (r_typ) {
> > > > >         case R_ARM_ABS32:
> > > > >                 r->r_addend = inst + sym->st_value;
> > > > >                 break;
> > > > > +       case R_ARM_MOVW_ABS_NC:
> > > > > +       case R_ARM_MOVT_ABS:
> > > > > +               offset = ((inst & 0xf0000) >> 4) | (inst & 0xfff);
> > > > > +               offset = (offset ^ 0x8000) - 0x8000;
> > > >
> > > > The code in arch/arm/kernel/module.c then right shifts the offset by
> > > > 16 for R_ARM_MOVT_ABS. Is that necessary?
> > > >
> > >
> > > MOVW/MOVT pairs are limited to an addend of -/+ 32 KiB, and the same
> > > value must be encoded in both instructions.
> >
> >
> > In my understanding, 'movt' loads the immediate value into
> > the upper 16 bits of the register.
> >
>
> Correct. It sets the upper 16 bits of a register without corrupting
> the lower 16 bits.
>
> > I am just curious about the code in arch/arm/kernel/module.c.
> >
> > Please see 'case R_ARM_MOVT_ABS:' part.
> >
> >   [1] 'offset' is the immediate value encoded in instruction
> >   [2] Add sym->st_value
> >   [3] Right-shift 'offset' by 16
> >   [4] Write it back to the instruction
> >
> > So, the immediate value encoded in the instruction
> > is divided by 65536.
> >
> > I guess we need something like the following?
> > (left-shift by 16).
> >
> >   if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS ||
> >       ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL)
> >           offset <<= 16;
> >
>
> No. The addend is not encoded in the same way as the effective immediate value.
>
> The addend is limited to -/+ 32 KiB (range of s16), and the MOVT
> instruction must use the same addend value as the MOVW instruction it
> is paired with, without shifting.
>
> This is necessary because otherwise, there is no way to handle an
> addend/symbol combination that results in a carry between the lower
> and upper 16 bit words. This is a consequence of the use of REL format
> rather than RELA, where the addend is part of the relocation and not
> encoded in the instructions.


Ah, OK.
Now I understand.
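
To make the point above concrete, here is a minimal sketch in C of how the
addend is read from an ARM (A32) MOVW or MOVT opcode and why it must stay
unshifted. The decoding expression is the one from the patch; the function
names movw_movt_addend() and resolved_imm16() are hypothetical and exist only
for this illustration. resolved_imm16() mimics what a loader has to do;
modpost itself only needs the addend.

```c
#include <stdint.h>

/*
 * Extract the signed 16-bit addend stored in an A32 MOVW/MOVT opcode.
 * The 16-bit immediate is split as imm4 (bits 19:16) : imm12 (bits 11:0).
 * Hypothetical helper name, for illustration only.
 */
static int32_t movw_movt_addend(uint32_t inst)
{
	int32_t offset = ((inst & 0xf0000) >> 4) | (inst & 0xfff);

	/* Sign-extend from bit 15, as the patch does. */
	return (offset ^ 0x8000) - 0x8000;
}

/*
 * Hypothetical loader-side helper: compute the 16-bit immediate that ends
 * up in the instruction. Both MOVW and MOVT start from the same unshifted
 * addend; only the final target address is split into halves.
 */
static uint16_t resolved_imm16(uint32_t inst, uint32_t sym_value, int is_movt)
{
	uint32_t target = sym_value + (uint32_t)movw_movt_addend(inst);

	return is_movt ? (uint16_t)(target >> 16) : (uint16_t)(target & 0xffff);
}
```

For example, with a symbol value of 0x0000fffc and an addend of 8 in both
instructions, the target is 0x00010004, so MOVW receives 0x0004 and MOVT
receives 0x0001. The carry into the upper half is only visible because MOVT
carries the full addend rather than a pre-shifted one.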




> >
> >
> >
> > >
> > > When constructing the actual immediate value from the symbol value and
> > > the addend, only the top 16 bits are used in MOVT and the bottom 16
> > > bits in MOVW.
> > >
> > > However, this code seems to borrow the Elf_Rela::addend field (which
> > > ARM does not use natively) to record the intermediate value, which
> > > would need to be split if it is used to fix up instruction opcodes.
> >
> > At first, modpost supported only RELA for section mismatch checks.
> >
> > Later, 2c1a51f39d95 ("[PATCH] kbuild: check SHT_REL sections")
> > added REL support.
> >
> > But, the common code still used Elf_Rela.
> >
> >
> > modpost does not need to write back the fixed instruction.
> > modpost is only interested in the offset address.
> >
> > Currently, modpost saves the offset address in
> > r->r_offset even for Rel. I do not like this code.
> >
> > So, I am trying to reduce the use of Elf_Rela.
> > For example, this patch.
> > https://patchwork.kernel.org/project/linux-kbuild/patch/20230521160426.1881124-8-masahiroy@xxxxxxxxxx/
> >
>
> Yeah, that looks better to me.
>
> >
> > > Btw the Thumb2 encodings of MOVT and MOVW seem to be missing here.
> >
> > Right, if CONFIG_THUMB2_KERNEL=y, the section mismatch check does not work.
> >
> > Several relocation types are just skipped.
> >
>
> Skipped entirely? Or only for the diagnostic print that outputs the symbol name?


Skipped entirely.

modpost cannot detect section mismatches
if you enable CONFIG_THUMB2_KERNEL.
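
As a rough sketch of what handling the Thumb2 case would involve (this is not
part of this patch), the addend of a Thumb2 MOVW/MOVT could be decoded as
below. It assumes the architectural encoding imm16 = imm4:i:imm3:imm8 spread
over the two halfwords, and that the caller has already read both halfwords
in native byte order. The function name thumb2_movw_movt_addend() is
hypothetical.

```c
#include <stdint.h>

/*
 * Extract the signed 16-bit addend from a Thumb2 MOVW/MOVT instruction.
 *
 *   i    = upper[10]
 *   imm4 = upper[3:0]
 *   imm3 = lower[14:12]
 *   imm8 = lower[7:0]
 *
 *   imm16 = imm4:i:imm3:imm8
 *
 * 'upper' is the first halfword in memory, 'lower' the second.
 * Hypothetical helper name, for illustration only.
 */
static int32_t thumb2_movw_movt_addend(uint16_t upper, uint16_t lower)
{
	int32_t offset = ((upper & 0x000f) << 12) |
			 ((upper & 0x0400) << 1) |
			 ((lower & 0x7000) >> 4) |
			 (lower & 0x00ff);

	/* Sign-extend from bit 15, same as in the ARM case. */
	return (offset ^ 0x8000) - 0x8000;
}
```

The sign extension is the same as for the ARM encoding; only the way the
immediate bits are scattered across the instruction differs.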



--
Best Regards
Masahiro Yamada