[RFC PATCH 2/2] arm64: kernel: switch to PIE code generation for relocatable kernels

Fangrui Song maskray at google.com
Wed Apr 27 19:40:30 PDT 2022


On 2022-04-27, Ard Biesheuvel wrote:
>We currently use ordinary, position dependent code generation for the
>core kernel, which happens to default to the 'small' code model on both
>GCC and Clang. This is the code model that relies on ADRP/ADD or
>ADRP/LDR pairs for symbol references, which are PC-relative with a range
>of -/+ 4 GiB, and therefore happen to be position independent in
>practice.
>
>This means that the fact that we can link the relocatable KASLR kernel
>using the -pie linker flag (which generates the runtime relocations and
>inserts them into the binary) is somewhat of a coincidence, and not
>something which is explicitly supported by the toolchains.

Agreed. The current -fno-PIE + -shared -Bsymbolic combination works by
coincidence; the toolchain does not guarantee it.

-shared needs -fpic object files. -shared -Bsymbolic is very similar to
-pie and therefore works with -fpie object files, but the usage is not
recommended from the toolchain perspective.

>The reason we have not used -fpie for code generation so far (which is
>the compiler flag that should be used to generate code that is to be
>linked with -pie) is that by default, it generates code based on
>assumptions that only hold for shared libraries and PIE executables,
>i.e., that gathering all relocatable quantities into a Global Offset
>Table (GOT) is desirable because it reduces the CoW footprint, and
>because it permits ELF symbol preemption (which lets an executable
>override symbols defined in a shared library, in a way that forces the
>shared library to update all of its internal references as well).
>Ironically, this means we end up with many more absolute references that
>all need to be fixed up at boot.

This is not about symbol preemption (when the executable and a shared
object define the same symbol, which one wins). An executable that
references, through the GOT, a symbol defined in a shared object is
regular relocation resolution; there is no preemption involved.

It is that the compiler prefers code generation which can avoid text
relocations / copy relocations / canonical PLT entries
(https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected#summary).

>Fortunately, we can convince the compiler to handle this in a way that
>is a bit more suitable for freestanding binaries such as the kernel, by
>setting the 'hidden' visibility #pragma, which informs the compiler that
>symbol preemption or CoW footprint are of no concern to us, and so
>PC-relative references that are resolved at link time are perfectly
>fine.

Agreed.
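
For illustration, a minimal sketch of the effect, assuming hidden.h boils
down to a visibility push (demo_counter and the two helpers are
hypothetical names, not kernel symbols). With default visibility, -fpie
code for an extern variable typically goes through the GOT on arm64
(ADRP + LDR with :got: relocations). With hidden visibility the compiler
knows the symbol is resolved within the link unit and can use a direct
ADRP/ADD pair.

```c
/* Assumed equivalent of what hidden.h does: hide every declaration below. */
#pragma GCC visibility push(hidden)

/* Hypothetical symbol; in a real build it would live in another object file. */
extern int demo_counter;

/* Taking the address: GOT load by default, PC-relative when hidden. */
int *demo_counter_addr(void)
{
	return &demo_counter;
}

/* Reading the value follows the same addressing choice. */
int demo_read_counter(void)
{
	return demo_counter;
}

/* Defined here only so that the sketch links on its own. */
int demo_counter = 41;

#pragma GCC visibility pop
```

Comparing the -fpie -S output with and without the two pragma lines
should show the difference; the exact instruction sequences depend on the
compiler and its version.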

>So let's enable this #pragma and build with -fpie when building a
>relocatable kernel. This also means that all constant data items that
>carry statically initialized pointer variables are now emitted into the
>.data.rel.ro* sections, so move these into .rodata where they belong.

LGTM, with one question: is ".rodata" a typo? The patch itself never
references .rodata.
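
For reference, a small sketch of why such objects leave .rodata under
-fpie (all names are hypothetical). A const object whose initializer
contains an address needs a dynamic relocation, so the compiler cannot
place it in a section that is read-only from load time and emits it into
.data.rel.ro* instead. Const data that contains no addresses stays in
.rodata.

```c
/* Plain constant: no address inside, so it can stay in .rodata. */
const int demo_plain = 42;

/* String storage itself contains no address either. */
static const char demo_name[] = "vmlinux";

/*
 * Const pointer initialized with an address: under -fpie/-fpic this needs
 * a runtime relocation, so GCC and Clang typically emit it into
 * .data.rel.ro (or .data.rel.ro.local) rather than .rodata.
 */
const char *const demo_name_ptr = demo_name;
```

Running readelf -S or objdump -t on the resulting object file should show
which section each symbol lands in.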

>Code size impact (GCC):
>
>Before:
>
>      text       data        bss      total filename
>  16712396   18659064     534556   35906016 vmlinux
>
>After:
>
>      text       data        bss      total filename
>  16804400   18612876     534556   35951832 vmlinux
>
>Code size impact (Clang):
>
>Before:
>
>      text       data        bss      total filename
>  17194584   13335060     535268   31064912 vmlinux
>
>After:
>
>      text       data        bss      total filename
>  17194536   13310032     535268   31039836 vmlinux
>
>Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
>---
> arch/arm64/Makefile             | 4 ++++
> arch/arm64/kernel/vmlinux.lds.S | 9 ++++-----
> 2 files changed, 8 insertions(+), 5 deletions(-)
>
>diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
>index 2f1de88651e6..94b6c51f5de6 100644
>--- a/arch/arm64/Makefile
>+++ b/arch/arm64/Makefile
>@@ -18,6 +18,10 @@ ifeq ($(CONFIG_RELOCATABLE), y)
> # with the relocation offsets always being zero.
> LDFLAGS_vmlinux		+= -shared -Bsymbolic -z notext \
> 			$(call ld-option, --no-apply-dynamic-relocs)
>+
>+# Generate position independent code without relying on a Global Offset Table
>+KBUILD_CFLAGS_KERNEL   += -fpie -include $(srctree)/include/linux/hidden.h
>+
> endif
>
> ifeq ($(CONFIG_ARM64_ERRATUM_843419),y)
>diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>index edaf0faf766f..b1e071ac1acf 100644
>--- a/arch/arm64/kernel/vmlinux.lds.S
>+++ b/arch/arm64/kernel/vmlinux.lds.S
>@@ -174,8 +174,6 @@ SECTIONS
> 			KEXEC_TEXT
> 			TRAMP_TEXT
> 			*(.gnu.warning)
>-		. = ALIGN(16);
>-		*(.got)			/* Global offset table		*/
> 	}
>
> 	/*
>@@ -192,6 +190,8 @@ SECTIONS
> 	/* everything from this point to __init_begin will be marked RO NX */
> 	RO_DATA(PAGE_SIZE)
>
>+	.data.rel.ro : ALIGN(8) { *(.got) *(.data.rel.ro*) }
>+
> 	HYPERVISOR_DATA_SECTIONS
>
> 	idmap_pg_dir = .;
>@@ -273,6 +273,8 @@ SECTIONS
> 	_sdata = .;
> 	RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_ALIGN)
>
>+	.data.rel : ALIGN(8) { *(.data.rel*) }
>+
> 	/*
> 	 * Data written with the MMU off but read with the MMU on requires
> 	 * cache lines to be invalidated, discarding up to a Cache Writeback
>@@ -320,9 +322,6 @@ SECTIONS
> 		*(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt)
> 	}
> 	ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
>-
>-	.data.rel.ro : { *(.data.rel.ro) }
>-	ASSERT(SIZEOF(.data.rel.ro) == 0, "Unexpected RELRO detected!")
> }
>
> #include "image-vars.h"
>-- 
>2.30.2



More information about the linux-arm-kernel mailing list