[PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
Fangrui Song
maskray at google.com
Wed Mar 10 22:29:21 GMT 2021
On 2021-03-10, Arnd Bergmann wrote:
>On Wed, Mar 10, 2021 at 9:50 PM Masahiro Yamada <masahiroy at kernel.org> wrote:
>> On Mon, Mar 1, 2021 at 10:11 AM Nicholas Piggin <npiggin at gmail.com> wrote:
>> > Excerpts from Arnd Bergmann's message of February 27, 2021 7:49 pm:
>
>>
>> masahiro at oscar:~/ref/linux$ echo 'void this_func_is_unused(void) {}'
>> >> kernel/cpu.c
>> masahiro at oscar:~/ref/linux$ export
>> CROSS_COMPILE=/home/masahiro/tools/powerpc-10.1.0/bin/powerpc-linux-
>> masahiro at oscar:~/ref/linux$ make ARCH=powerpc defconfig
>> masahiro at oscar:~/ref/linux$ ./scripts/config -e EXPERT
>> masahiro at oscar:~/ref/linux$ ./scripts/config -e LD_DEAD_CODE_DATA_ELIMINATION
>> masahiro at oscar:~/ref/linux$
>> ~/tools/powerpc-10.1.0/bin/powerpc-linux-nm -n vmlinux | grep
>> this_func
>> c000000000170560 T .this_func_is_unused
>> c000000001d8d560 D this_func_is_unused
>> masahiro at oscar:~/ref/linux$ grep DEAD_CODE_ .config
>> CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION=y
>> CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
>>
>>
>> If I remember correctly,
>> LD_DEAD_CODE_DATA_ELIMINATION dropped unused functions
>> when I tried it last time.
--gc-sections drops unused sections.
If the unused function is part of a larger section which is retained due to other symbols (-fno-function-sections),
the unused section will be retained as well.
>>
>>
>> I also tried arm64 with a HAVE_LD_DEAD_CODE_DATA_ELIMINATION hack.
>> The result was the same.
>>
>>
>>
>> Am I missing something?
>
>It's possible that it only works in combination with CLANG_LTO now
>because something broke. I definitely saw a reduction in kernel
>size when both options are enabled, but did not try a simple test
>case like you did.
>
>Maybe some other reference gets created that prevents the function
>from being garbage-collected unless that other option is removed
>as well?
>
> Arnd
I believe with LLVM regular LTO, --gc-sections has very little benefit
on compiler generated sections. It is still useful for assembly generated sections
(but most such sections are probably needed):
* Target specific optimizations can drop references on constants (e.g. `memcpy(..., &constant, sizeof(constant));`)
* Due to phase ordering issues some definitions are not discarded by the optimizer.
For ThinLTO there are more compiler generated sections discarded by `--gc-sections`:
* ThinLTO can cause a definition to be imported to other modules. The original definition may be unneeded after imports.
* The definition may survive after intra-module optimization. After imports, a round of (inter-module) IR optimizations after `computeDeadSymbolsWithConstProp` may make the definition unneeded.
* Symbol resolution is conservative.
Regarding symbol resolution, symbol resolution happens before LTO and LTO happens before --gc-sections. The symbol resolution process may be conservative: it may communicate to LTO that some symbols are referenced by regular object files while in the GC stage the references turn out to not exist because of discarded sections with more precise GC roots.
(I've added the above points to my https://maskray.me/blog/2021-02-28-linker-garbage-collection#link-time-optimization )
More information about the linux-arm-kernel
mailing list