[PATCH] ARM: decompressor: fix BSS size calculation for LLVM ld.lld

Ard Biesheuvel ardb at kernel.org
Sat Feb 6 08:31:56 EST 2021


On Fri, 5 Feb 2021 at 19:56, Nick Desaulniers <ndesaulniers at google.com> wrote:
>
> On Fri, Feb 5, 2021 at 10:11 AM Ard Biesheuvel <ardb at kernel.org> wrote:
> >
> > On Fri, 5 Feb 2021 at 19:00, Nick Desaulniers <ndesaulniers at google.com> wrote:
> > >
> > > On Fri, Feb 5, 2021 at 12:52 AM Ard Biesheuvel <ardb at kernel.org> wrote:
> > > >
> > > > The LLVM ld.lld linker uses a different symbol type for __bss_start,
> > > > which throws off the calculation of KBSS_SZ. Up until now, this has
> > > > gone unnoticed as it only affects the appended DTB case, but pending
> > > > changes for ARM in the way the decompressed kernel is cleaned from
> > > > the caches have uncovered this problem.
> > > >
> > > > On a ld.lld build:
> > > >
> > > >   $ nm vmlinux |grep bss_
> > > >   c1c22034 D __bss_start
> > > >   c1c86e98 B __bss_stop
> > > >
> > > + Fangrui,
> > > Fangrui, __bss_start looks like it's linker script defined by the
> > > BSS_SECTION macro from include/asm-generic/vmlinux.lds.h:1160 being
> > > used in arch/arm/kernel/vmlinux.lds.S:149.  Should these symbols be
> > > placed in .bss? Might save a few bytes in the image, unless there's an
> > > initial value that's encoded with them?
> > >
> >
> > Not sure what you are asking here. These symbols just delineate .bss,
> > they don't take up any space themselves.
> >
> > What seems to be happening is that the placement of __bss_start
> > outside of the .sbss/.bss section declarations causes it to be
> > annotated as residing in .data.
>
> Perhaps a misunderstanding on my part on how symbols are represented
> in ELF, but my understanding is:
>
> $ cat foo.c
> int foo;
> int bar = 0;
> int baz = 42;
> $ cc -c foo.c
> $ nm foo.o
> 0000000000000004 B bar
> 0000000000000000 D baz
> 0000000000000000 B foo
> $ ls -l foo.o
> -rw-r----- 1 ndesaulniers primarygroup 1016 Feb  5 10:47 foo.o
>
> $ cat bar.c
> int foo;
> int bar = 0;
> int baz = 0; // changed from foo.c
> $ cc -c bar.c
> $ nm bar.o
> 0000000000000004 B bar
> 0000000000000008 B baz
> 0000000000000000 B foo
> $ ls -l bar.o
> -rw-r----- 1 ndesaulniers primarygroup 1008 Feb  5 10:48 bar.o
> # ^ smaller object file
>
> That is, if a symbol's value is an address within .bss, then no
> additional space is reserved in the ELF object, since the initial value
> for all such symbols in the section can be memset to 0 by the loader.
> But if a symbol's value is an address in .data, that initial value must
> occupy space in the object file.  Perhaps that's not the case though
> for linker defined symbols, since I'm not sure what data/initial value
> they would correspond to besides the Elf{32|64}_Sym's value in the
> symbol table?  (I should go reread the relevant section from Expert C
> Programming: Deep C Secrets, or finish reading Linkers and Loaders).

A symbol defined in C allocates space in the binary (either in .data
or in .bss), along with a symbol to refer to it.

A linker defined symbol such as __bss_start is just a pointer into the
binary. It will alias with whichever C variable ended up at offset 0x0
in the section. Remember that clearing [__bss_start, __bss_stop) in the
startup code is what initializes those variables to zero.
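
As a rough illustration of the above (a user-space sketch, not the kernel's
startup code; image, bss_start, bss_stop and the helper functions are
made-up names), the boundary symbols are just addresses into the image, and
the size of the region is simply the difference between them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a loaded image: 8 bytes of ".data"
 * followed by 8 bytes of ".bss". */
static unsigned char image[16];

/* Like linker-defined symbols, these are only addresses into the
 * image; they occupy no storage in it themselves. */
static unsigned char *const bss_start = image + 8;
static unsigned char *const bss_stop = image + 16;

/* Size of a region delimited by a start and an exclusive stop address. */
uint32_t region_size(uint32_t start, uint32_t stop)
{
	return stop - start;
}

/* Pretend the memory holds stale contents before startup code runs. */
void fill_stale(void)
{
	memset(image, 0xAA, sizeof(image));
}

/* What startup code does: zero every byte in [bss_start, bss_stop). */
void clear_bss(void)
{
	assert(bss_stop >= bss_start);
	memset(bss_start, 0, (size_t)(bss_stop - bss_start));
}
```

With the addresses from the nm output quoted above,
region_size(0xc1c22034, 0xc1c86e98) gives 0x64e64 bytes. After fill_stale()
followed by clear_bss(), only the bytes from image[8] onwards read as zero.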


