[PATCH v6 03/20] modpost: detect section mismatch for R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS

Masahiro Yamada masahiroy at kernel.org
Tue May 23 04:58:59 PDT 2023


On Tue, May 23, 2023 at 6:50 AM Ard Biesheuvel <ardb at kernel.org> wrote:
>
> On Mon, 22 May 2023 at 20:03, Nick Desaulniers <ndesaulniers at google.com> wrote:
> >
> > + linux-arm-kernel
> >
> > On Sun, May 21, 2023 at 9:05 AM Masahiro Yamada <masahiroy at kernel.org> wrote:
> > >
> > > ARM defconfig misses to detect some section mismatches.
> > >
> > >   [test code]
> > >
> > >     #include <linux/init.h>
> > >
> > >     int __initdata foo;
> > >     int get_foo(int x) { return foo; }
> > >
> > > It is apparently a bad reference, but modpost does not report anything
> > > for ARM defconfig (i.e. multi_v7_defconfig).
> > >
> > > The test code above produces the following relocations.
> > >
> > >   Relocation section '.rel.text' at offset 0x200 contains 2 entries:
> > >    Offset     Info    Type            Sym.Value  Sym. Name
> > >   00000000  0000062b R_ARM_MOVW_ABS_NC 00000000   .LANCHOR0
> > >   00000004  0000062c R_ARM_MOVT_ABS    00000000   .LANCHOR0
> > >
> > >   Relocation section '.rel.ARM.exidx' at offset 0x210 contains 2 entries:
> > >    Offset     Info    Type            Sym.Value  Sym. Name
> > >   00000000  0000022a R_ARM_PREL31      00000000   .text
> > >   00000000  00001000 R_ARM_NONE        00000000   __aeabi_unwind_cpp_pr0
> > >
> > > Currently, R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS are just skipped.
> > >
> > > Add code to handle them. I checked arch/arm/kernel/module.c to learn
> > > how the offset is encoded in the instruction.
> > >
> > > The referenced symbol in relocation might be a local anchor.
> > > If is_valid_name() returns false, let's search for a better symbol name.
> > >
> > > Signed-off-by: Masahiro Yamada <masahiroy at kernel.org>
> > > ---
> > >
> > >  scripts/mod/modpost.c | 12 ++++++++++--
> > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> > > index 34fbbd85bfde..ed2301e951a9 100644
> > > --- a/scripts/mod/modpost.c
> > > +++ b/scripts/mod/modpost.c
> > > @@ -1108,7 +1108,7 @@ static inline int is_valid_name(struct elf_info *elf, Elf_Sym *sym)
> > >  /**
> > >   * Find symbol based on relocation record info.
> > >   * In some cases the symbol supplied is a valid symbol so
> > > - * return refsym. If st_name != 0 we assume this is a valid symbol.
> > > + * return refsym. If is_valid_name() == true, we assume this is a valid symbol.
> > >   * In other cases the symbol needs to be looked up in the symbol table
> > >   * based on section and address.
> > >   *  **/
> > > @@ -1121,7 +1121,7 @@ static Elf_Sym *find_tosym(struct elf_info *elf, Elf64_Sword addr,
> > >         Elf64_Sword d;
> > >         unsigned int relsym_secindex;
> > >
> > > -       if (relsym->st_name != 0)
> > > +       if (is_valid_name(elf, relsym))
> > >                 return relsym;
> > >
> > >         /*
> > > @@ -1312,11 +1312,19 @@ static int addend_arm_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)
> > >         unsigned int r_typ = ELF_R_TYPE(r->r_info);
> > >         Elf_Sym *sym = elf->symtab_start + ELF_R_SYM(r->r_info);
> > >         unsigned int inst = TO_NATIVE(*reloc_location(elf, sechdr, r));
> > > +       int offset;
> > >
> > >         switch (r_typ) {
> > >         case R_ARM_ABS32:
> > >                 r->r_addend = inst + sym->st_value;
> > >                 break;
> > > +       case R_ARM_MOVW_ABS_NC:
> > > +       case R_ARM_MOVT_ABS:
> > > +               offset = ((inst & 0xf0000) >> 4) | (inst & 0xfff);
> > > +               offset = (offset ^ 0x8000) - 0x8000;
> >
> > The code in arch/arm/kernel/module.c then right shifts the offset by
> > 16 for R_ARM_MOVT_ABS. Is that necessary?
> >
>
> MOVW/MOVT pairs are limited to an addend of -/+ 32 KiB, and the same
> value must be encoded in both instructions.


In my understanding, 'movt' loads the immediate value to
the upper 16-bit of the register.

I am just curious about the code in arch/arm/kernel/module.c.

Please see 'case R_ARM_MOVT_ABS:' part.

  [1] 'offset' is the immediate value encoded in instruction
  [2] Add sym->st_value
  [3] Right-shift 'offset' by 16
  [4] Write it back to the instruction

So, the immediate value encoded in the instruction
is divided by 65536.

I guess we need something like the following?
(left-shift by 16).

  if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS ||
      ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL)
          offset <<= 16;




>
> When constructing the actual immediate value from the symbol value and
> the addend, only the top 16 bits are used in MOVT and the bottom 16
> bits in MOVW.
>
> However, this code seems to borrow the Elf_Rela::addend field (which
> ARM does not use natively) to record the intermediate value, which
> would need to be split if it is used to fix up instruction opcodes.

At first, modpost supported only RELA for section mismatch checks.

Later, 2c1a51f39d95 ("[PATCH] kbuild: check SHT_REL sections")
added REL support.

But, the common code still used Elf_Rela.


modpost does not need to write back the fixed instruction.
modpost is only interested in the offset address.

Currently, modpost saves the offset address in
r->r_offset even for Rel. I do not like this code.

So, I am trying to reduce the use of Elf_Rela.
For example, this patch.
https://patchwork.kernel.org/project/linux-kbuild/patch/20230521160426.1881124-8-masahiroy@kernel.org/


> Btw the Thumb2 encodings of MOVT and MOVW seem to be missing here.

Right, if CONFIG_THUMB2_KERNEL=y, section mismatch check.

Several relocation types are just skipped.






>
>
> > > +               offset += sym->st_value;
> > > +               r->r_addend = offset;
> > > +               break;
> > >         case R_ARM_PC24:
> > >         case R_ARM_CALL:
> > >         case R_ARM_JUMP24:
> > > --
> > > 2.39.2
> > >
> >
> >
> > --
> > Thanks,
> > ~Nick Desaulniers
--
Best Regards
Masahiro Yamada



More information about the linux-arm-kernel mailing list