[RFC] kexec: Use bpf to allow kexec to load PE format boot image

Pingfan Liu piliu at redhat.com
Mon Jan 13 17:28:25 PST 2025


UEFI PE bootable images are becoming more and more common in distributions.
But loading that kind of image with kexec while IMA is enabled is still an
open issue.

*** A brief review of the history ***
There are two categories of methods to handle this issue.
  -1. UEFI service emulator for UEFI stub
  -2. PE format parser

For the first one, I have tried a purgatory-style emulator [1]. But it
runs into trouble scaling across hardware.  For the second one, there are two
choices: implement the parser inside the kernel, or implement it in user
space.  Both the zboot-format [2] and UKI-format [3] in-kernel parsers were
rejected due to the concern that a growing variety of format parsers would
inflate the kernel code.  In the end, these parsers landed in the user space
'kexec-tools'.


From the beginning, it was recognized that a user space parser cannot
satisfy the requirements of secure boot without an extra embedded signature.
The issue was set aside at that time.

But now, more and more users expect the security feature and want
kexec_file_load to guarantee it through IMA.  I tried to fix the issue with the
extra embedded signature method, but that approach was also disliked.

Inspired by Philipp's suggestion, made in the discussion of [1], to implement
systemd-stub in bpf opcode, I have turned to bpf and hope that parsers written
as bpf-programs can resolve this issue.

[1]: https://lore.kernel.org/lkml/20240819145417.23367-1-piliu@redhat.com/T/
[2]: https://lore.kernel.org/kexec/20230306030305.15595-1-kernelfans@gmail.com/
[3]: https://lore.kernel.org/lkml/20230911052535.335770-1-kernel@jfarr.cc/
[4]: https://lore.kernel.org/linux-arm-kernel/20230921133703.39042-2-kernelfans@gmail.com/T/




*** Reflecting on the problem and a new proposal ***

The UEFI emulator is anchored to the UEFI spec, which incurs a lot of work
to support various hardware.  For example, to support TPM, the emulator
would have to implement the PCI/I2C bus protocols.

But if the problem is confined to the original linux kernel boot protocol, it
becomes simple.  Only three things need to be considered: the kernel image, the
initrd and the command line.  If we can obtain them in a secure way, we can
tackle the problem.

The integrity of a file is ensured by its signature envelope.  If the kexec'ed
files are parsed in user space, the envelopes are opened there and are no
longer valid.  So the files should be passed into kernel space intact, and be
verified and manipulated there.  And to manipulate files of various formats, we
need bpf-programs which know those formats.

There are three parties in this solution:
-1. The kexec-tools itself is protected by IMA. It creates a bpf-map and
stores the UKI addon file names in the map. Later, the bpf-program will call a
bpf-helper to add these files into the initrd.

-2. The bpf-program is contained in a dedicated '.bpf' section in the PE file.
When kexec_file_load is given a PE image, the kernel extracts the '.bpf'
section and exposes it to user space through procfs, and kexec-tools starts the
program.  In this way, the bpf-program itself is protected from tampering.

The bpf-program does two things:
	-1. parse the image format
	-2. call bpf kexec helpers to manipulate signed files

-3. The bpf helpers. Three helpers will be introduced:
The first one is for data exchange between the bpf-program and the kernel.
The second one is for decompression.
The third one is for manipulating the cpio archive.
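
As an illustration of the '.bpf' section lookup described in party 2, here is
a minimal userspace sketch of finding a named section in a PE image.  The
function name pe_find_section() is hypothetical and not part of any posted
patch; the header offsets follow the published PE/COFF layout.  A kernel
implementation would differ in detail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian readers, independent of host alignment and endianness. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: find section @name (at most 8 bytes, e.g. ".bpf")
 * in the PE image @img of @len bytes.  On success returns 0 and stores the
 * file offset and size of the section's raw data.  Returns -1 if the image
 * is malformed or the section is absent.
 */
static int pe_find_section(const uint8_t *img, size_t len, const char *name,
			   uint32_t *off, uint32_t *size)
{
	char padded[8] = { 0 };		/* section names are NUL-padded */
	size_t nlen = strlen(name);
	uint32_t pe, nsec, opt;
	size_t tbl;

	if (nlen > sizeof(padded) || len < 0x40 ||
	    img[0] != 'M' || img[1] != 'Z')
		return -1;
	memcpy(padded, name, nlen);

	pe = rd32(img + 0x3c);			/* e_lfanew */
	if (pe > len || len - pe < 24 || memcmp(img + pe, "PE\0\0", 4))
		return -1;
	nsec = rd16(img + pe + 6);		/* NumberOfSections */
	opt = rd16(img + pe + 20);		/* SizeOfOptionalHeader */
	tbl = (size_t)pe + 24 + opt;		/* first section header */

	for (uint32_t i = 0; i < nsec; i++, tbl += 40) {
		if (tbl > len || len - tbl < 40)
			return -1;
		if (memcmp(img + tbl, padded, 8))
			continue;
		*size = rd32(img + tbl + 16);	/* SizeOfRawData */
		*off = rd32(img + tbl + 20);	/* PointerToRawData */
		if (*off > len || *size > len - *off)
			return -1;
		return 0;
	}
	return -1;
}
```

Every offset is bounds-checked against the image length, since the image has
to be treated as untrusted until its signature has been verified.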



***  Overview of the design in Pseudocode ***


ThreadA: kexec thread which invokes kexec_file_load
ThreadB: the dedicated thread in kexec-tools to load bpf-prog
------
Diag 1. the interaction between the bpf-prog loader and its executor


ThreadA						ThreadB

						wait on eventfd_A


expose bpf-prog through procfs
& signal eventfd_A
& wait on eventfd_B

						read the bpf-prog from procfs
						& initialize the bpf and install it to the fentry
						& signal eventfd_B
						& wait on eventfd_A again
						
fentry executes bpf-prog to parse image
& generate output for the next step
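
The handshake in Diag 1 can be modelled in user space with two eventfds and a
pthread.  This is only a sketch of the ordering: the procfs export and the
fentry attach are replaced by two flags, all names (struct handshake,
run_handshake() and so on) are hypothetical, and the final "wait on eventfd_A
again" step of ThreadB is left out.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

struct handshake {
	int efd_a;			/* ThreadA -> ThreadB: prog exposed */
	int efd_b;			/* ThreadB -> ThreadA: prog attached */
	atomic_int prog_exposed;	/* stand-in for the procfs export */
	atomic_int prog_attached;	/* stand-in for the fentry attach */
};

static int ev_signal(int fd)
{
	uint64_t one = 1;

	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

static int ev_wait(int fd)
{
	uint64_t val;

	/* Blocks until the counter is non-zero, then resets it. */
	return read(fd, &val, sizeof(val)) == (ssize_t)sizeof(val) ? 0 : -1;
}

/* ThreadB: wait for the prog, "load" it, then wake ThreadA. */
static void *thread_b(void *arg)
{
	struct handshake *hs = arg;

	if (ev_wait(hs->efd_a) == 0)
		atomic_store(&hs->prog_attached,
			     atomic_load(&hs->prog_exposed));
	ev_signal(hs->efd_b);
	return NULL;
}

/* ThreadA: returns 1 if ThreadB attached the prog before we continued. */
static int run_handshake(void)
{
	struct handshake hs = {
		.efd_a = eventfd(0, 0),
		.efd_b = eventfd(0, 0),
	};
	pthread_t tb;
	int ok = 0;

	if (hs.efd_a >= 0 && hs.efd_b >= 0 &&
	    pthread_create(&tb, NULL, thread_b, &hs) == 0) {
		atomic_store(&hs.prog_exposed, 1);
		ev_signal(hs.efd_a);
		ev_wait(hs.efd_b);
		/* In the real flow the fentry hook would fire from here on. */
		ok = atomic_load(&hs.prog_attached);
		pthread_join(tb, NULL);
	}
	if (hs.efd_a >= 0)
		close(hs.efd_a);
	if (hs.efd_b >= 0)
		close(hs.efd_b);
	return ok;
}
```

The two eventfds give a strict ping-pong: ThreadA cannot reach the hook before
ThreadB has signalled eventfd_B.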


-------------------
Diag 2. bpf-prog

SEC("fentry/kexec_pe_parser_hook")
int BPF_PROG(pe_parser, struct kimage *image, ...)
{
	/* Scratch buffers; bpf_ringbuf_reserve() takes (ringbuf, size, flags) */
	buf = bpf_ringbuf_reserve(rb, size, 0);
	buf_result = bpf_ringbuf_reserve(rb, res_sz, 0);

	/* Ask kernel to copy the resource content to here */
	bpf_helper_carrier(resource_name, buf, size, in);

	/* Parse the format lying in buf */
	...
	/* call extra bpf-helpers */
	...

	/* Ask kernel to copy the resource content from here */
	bpf_helper_carrier(resource_name, buf_result, res_sz, out);

	return 0;
}

At present, the bpf map functions provide a mechanism to exchange data between
user space and a bpf-prog.  But between a bpf-prog and the kernel there is no
good choice.  So I introduce a bpf helper function
	bpf_helper_carrier(resource_name, buf, size, in)

This helper implements the data exchange between the kernel and the bpf-prog.
In this way, the data parsing process is no longer exposed to user space.
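
To make the in/out semantics of the carrier concrete, here is a small
userspace model.  It is not the proposed kernel implementation: the resource
table, the resource names, the enum values and the 64-byte slot size are all
assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical direction flags, mirroring "in" and "out" in Diag 2. */
enum carrier_dir { CARRIER_IN, CARRIER_OUT };

/* Hypothetical kernel-side slot holding one named resource. */
struct kexec_resource {
	const char *name;
	unsigned char data[64];
	size_t len;
};

static struct kexec_resource resources[] = {
	{ .name = "kernel" }, { .name = "initrd" }, { .name = "cmdline" },
};

/*
 * Model of the exchange: CARRIER_IN copies the named resource into @buf
 * (truncated to @size), CARRIER_OUT copies @buf back into the resource.
 * Returns the number of bytes copied, or -1 on an unknown name or when
 * the data does not fit.
 */
static long carrier(const char *name, void *buf, size_t size,
		    enum carrier_dir dir)
{
	size_t count = sizeof(resources) / sizeof(resources[0]);

	for (size_t i = 0; i < count; i++) {
		struct kexec_resource *r = &resources[i];

		if (strcmp(r->name, name))
			continue;
		if (dir == CARRIER_IN) {
			size_t n = r->len < size ? r->len : size;

			memcpy(buf, r->data, n);
			return (long)n;
		}
		if (size > sizeof(r->data))
			return -1;
		memcpy(r->data, buf, size);
		r->len = size;
		return (long)size;
	}
	return -1;
}
```

The point of the model is that both copies happen on the kernel side of the
boundary, keyed by a resource name, so the parsed data never passes through a
user space buffer.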



extra bpf-helpers:

	/* Decompress the compressed kernel image */
	bpf_helper_decompress(src, src_size, dst, dst_sz)
	
	/*
	 * Verify the signature of @addon_filename, then add the file to the
	 * initrd directory @dst_dir
	 */
	bpf_helper_supplement_initrd(dst_dir, addon_filename)

	Note: Because a UEFI environment (such as edk2) provides only basic
	file operations on FAT filesystems, any UEFI-stub PE image (like
	systemd-stub) is restricted to these basic operation services.  As a
	result, the functionality required of such bpf-kexec helpers is
	inherently limited.
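
For the cpio side of bpf_helper_supplement_initrd(), the core operation is
appending one file entry in the "newc" format that the kernel's initramfs
unpacker reads.  The sketch below shows only that step in user space; the
function name cpio_append_file() is hypothetical, and signature verification
and the closing "TRAILER!!!" entry are omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CPIO_HDR_LEN 110	/* "070701" + 13 fields of 8 hex digits */

static size_t align4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper: write one regular-file entry for @path with
 * @data_len bytes of @data into @out (capacity @cap) in newc format.
 * Returns the number of bytes written, or 0 if @out is too small.
 */
static size_t cpio_append_file(uint8_t *out, size_t cap, const char *path,
			       const void *data, uint32_t data_len)
{
	size_t namesize = strlen(path) + 1;	/* includes the NUL */
	size_t data_off = align4(CPIO_HDR_LEN + namesize);
	size_t total = data_off + align4(data_len);
	char hdr[CPIO_HDR_LEN + 1];

	if (total > cap)
		return 0;

	snprintf(hdr, sizeof(hdr),
		 "070701%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X",
		 0u,			/* ino */
		 0100644u,		/* mode: regular file, rw-r--r-- */
		 0u, 0u,		/* uid, gid */
		 1u,			/* nlink */
		 0u,			/* mtime */
		 (unsigned)data_len,	/* filesize */
		 0u, 0u, 0u, 0u,	/* dev and rdev major/minor */
		 (unsigned)namesize,
		 0u);			/* check: zero for magic 070701 */

	memset(out, 0, total);		/* zero the alignment padding */
	memcpy(out, hdr, CPIO_HDR_LEN);
	memcpy(out + CPIO_HDR_LEN, path, namesize);
	memcpy(out + data_off, data, data_len);
	return total;
}
```

Both the file name and the file data are padded to a 4-byte boundary, which is
why the helper can work on the archive as a flat byte buffer.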
	

*** Thoughts about the basic operations ***

The basic operations influence the stability of the bpf-kexec-helpers.

kexec_file_load deals with three kinds of elements: the linux kernel, the
initrd and the cmdline.

For the kernel, on arm64 or riscv, getting the bootable image out of the
compressed data requires a bpf-helper function that wraps __decompress().

For the initrd, systemd-sysext may require adding extra files into the initrd.

For the cmdline, some string trimming or concatenation may be required.
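
The cmdline case is small enough to sketch in full.  The following userspace
function trims a fragment and joins it onto an existing command line; the name
cmdline_append() is hypothetical and only illustrates the kind of string
operation meant here.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append @frag to the NUL-terminated command line in
 * @dst (capacity @cap).  Leading and trailing whitespace of @frag is
 * trimmed and fragments are separated by a single space.  Returns 0 on
 * success (an all-whitespace fragment is a no-op) and -1, leaving @dst
 * unchanged, if the result would not fit.
 */
static int cmdline_append(char *dst, size_t cap, const char *frag)
{
	size_t used = strlen(dst);
	size_t flen;

	while (isspace((unsigned char)*frag))
		frag++;
	flen = strlen(frag);
	while (flen && isspace((unsigned char)frag[flen - 1]))
		flen--;
	if (!flen)
		return 0;

	/* optional separator + fragment + terminating NUL */
	if (used + (used ? 1 : 0) + flen + 1 > cap)
		return -1;
	if (used)
		dst[used++] = ' ';
	memcpy(dst + used, frag, flen);
	dst[used + flen] = '\0';
	return 0;
}
```

For example, appending "  root=/dev/sda1 \n" and then "\tquiet  " to an empty
buffer yields "root=/dev/sda1 quiet".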

Overall, these user requirements are foreseeable and straightforward,
suggesting that bpf-kexec-helpers will likely remain stable without significant
changes.


Cc: Alexei Starovoitov <ast at kernel.org>
Cc: Daniel Borkmann <daniel at iogearbox.net>
Cc: John Fastabend <john.fastabend at gmail.com>
Cc: Jeremy Linton <jeremy.linton at arm.com>
Cc: Catalin Marinas <catalin.marinas at arm.com>
Cc: Will Deacon <will at kernel.org>
Cc: Mark Rutland <mark.rutland at arm.com>
Cc: Simon Horman <horms at kernel.org>
Cc: Gerd Hoffmann <kraxel at redhat.com>
Cc: Vitaly Kuznetsov <vkuznets at redhat.com>
Cc: Philipp Rudo <prudo at redhat.com>
Cc: Jan Hendrik Farr <kernel at jfarr.cc>
Cc: Baoquan He <bhe at redhat.com>
Cc: Dave Young <dyoung at redhat.com>
Cc: Eric Biederman <ebiederm at xmission.com>
Cc: Pingfan Liu <piliu at redhat.com>
To: kexec at lists.infradead.org
To: bpf at vger.kernel.org



