[PATCH v2 00/28] Add support for Clang LTO

Masahiro Yamada masahiroy at kernel.org
Sat Sep 5 20:24:38 EDT 2020


On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen at google.com> wrote:
>
> This patch series adds support for building x86_64 and arm64 kernels
> with Clang's Link Time Optimization (LTO).
>
> In addition to performance, the primary motivation for LTO is
> to allow Clang's Control-Flow Integrity (CFI) to be used in the
> kernel. Google has shipped millions of Pixel devices running three
> major kernel versions with LTO+CFI since 2018.
>
> Most of the patches are build system changes for handling LLVM
> bitcode, which Clang produces with LTO instead of ELF object files,
> postponing ELF processing until a later stage, and ensuring initcall
> ordering.
>
> Note that patches 1-4 are not directly related to LTO, but are
> needed to compile LTO kernels with ToT Clang, so I'm including them
> in the series for your convenience:
>
>  - Patches 1-3 are required for building the kernel with ToT Clang,
>    and IAS, and patch 4 is needed to build allmodconfig with LTO.
>
>  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
>


I still do not understand how this patch set works.
(only me?)

Please let me ask fundamental questions.



I applied this series on top of Linus' tree,
and compiled for ARCH=arm64.

I compared the kernel size with/without LTO.



[1] No LTO  (arm64 defconfig, CONFIG_LTO_NONE)

$ llvm-size   vmlinux
   text    data     bss     dec     hex filename
15848692 10099449 493060 26441201 19375f1 vmlinux



[2] Clang LTO  (arm64 defconfig + CONFIG_LTO_CLANG)

$ llvm-size   vmlinux
   text    data     bss     dec     hex filename
15906864 10197445 490804 26595113 195cf29 vmlinux


I compared the size of raw binary, arch/arm64/boot/Image.
Its size increased too.



So, in my experiment, enabling CONFIG_LTO_CLANG
increases the kernel size.
Is this correct?


One more thing, could you teach me
how Clang LTO optimizes the code against
relocatable objects?



When I learned Clang LTO first, I read this document:
https://llvm.org/docs/LinkTimeOptimization.html

It is easy to confirm the final executable
does not contain foo2, foo3...



In contrast to userspace programs,
kernel modules are basically relocatable objects.

Does Clang drop unused symbols from relocatable objects?
If so, how?

I implemented an example module (see the attachment),
and checked the symbols.
Nothing was dropped.

The situation is the same for build-in
because LTO is run against vmlinux.o, which is
relocatable as well.


--
Best Regards

Masahiro Yamada
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-lto-test-module.patch
Type: application/x-patch
Size: 3428 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20200906/5e484947/attachment.bin>


More information about the linux-arm-kernel mailing list