EFI table being corrupted during Kexec
Eric W. Biederman
ebiederm at xmission.com
Tue Sep 10 07:26:00 PDT 2024
Breno Leitao <leitao at debian.org> writes:
> We've seen a problem in upstream kernel kexec, where an EFI TPM log event table
> is being overwritten. This problem happens on real machines, as well as in a
> recent EDK2 QEMU VM.
>
> Digging deeper, the table is being overwritten during kexec, more precisely when
> relocating the kernel (the relocate_kernel() function).
>
> I've also found that the table is being properly reserved using
> memblock_reserve() early in the boot, and that range gets overwritten later
> by relocate_kernel(). In other words, kexec is overwriting memory that was
> previously reserved (via memblock_reserve()).
>
> Usama found that kexec only honours memory reservations from /sys/firmware/memmap,
> which comes from the e820_table_firmware table.
>
> Looking at the TPM spec, I found the following part:
>
> If the ACPI TPM2 table contains the address and size of the Platform Firmware TCG log,
> firmware “pins” the memory associated with the Platform Firmware TCG log, and reports
> this memory as “Reserved” memory via the INT 15h/E820 interface.
>
>
> From: https://trustedcomputinggroup.org/wp-content/uploads/PC-ClientPlatform_Profile_for_TPM_2p0_Systems_v49_161114_public-review.pdf
>
> I am wondering if that memory region/range should be part of the e820 table that is
> passed by EFI firmware to the kernel, and if it is not passed (as it is not being
> passed today), then the kernel doesn't need to respect it, and it is free to
> overwrite it (as it does today). In other words, this is a firmware bug and not a
> kernel bug.
>
> Am I missing something?
I agree that this appears to be a firmware bug. This memory is reserved
in one location and not in another location.
That said, that doesn't mean we can't deal with it in the kernel.
acpi_table_upgrade seems to have hit a similar issue and calls
arch_reserve_mem_area to reserve the area in the e820 tables.
The last time I looked, the e820 tables (in the kernel) were used to
store the EFI memory map when available, and only used the true e820
data on older systems.
Which is a long way of saying that the e820 table in the kernel, last I
looked, was the master table of how the firmware views the memory.
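For anyone who wants to check what the firmware-provided map actually
says about a given range, here is a rough user-space sketch that reads
the entries exported under /sys/firmware/memmap (the interface the
quoted message says kexec honours). It assumes each entry directory has
"start", "end" (inclusive) and "type" files, and that usable memory is
labelled "System RAM". The names struct fw_entry, read_memmap and
range_is_reserved are made up for this illustration, and reading the
files may require root.

```c
#include <assert.h>
#include <dirent.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One firmware memory map entry as exported under /sys/firmware/memmap. */
struct fw_entry {
	uint64_t start;   /* first byte of the range */
	uint64_t end;     /* last byte of the range (inclusive) */
	char type[64];    /* e.g. "System RAM" or "Reserved" */
};

/* Read the first line of <dir>/<name> into buf, stripping the newline. */
static int read_line(const char *dir, const char *name, char *buf, size_t len)
{
	char path[512];
	FILE *f;

	assert(len > 0);
	snprintf(path, sizeof(path), "%s/%s", dir, name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Load up to max entries from root; returns the count, or -1 if root
 * cannot be opened. Entries that cannot be read are skipped. */
int read_memmap(const char *root, struct fw_entry *out, int max)
{
	DIR *d = opendir(root);
	struct dirent *de;
	int n = 0;

	if (!d)
		return -1;
	while (n < max && (de = readdir(d)) != NULL) {
		char dir[384], start[32], end[32];

		if (de->d_name[0] == '.')
			continue;
		snprintf(dir, sizeof(dir), "%s/%s", root, de->d_name);
		if (read_line(dir, "start", start, sizeof(start)) ||
		    read_line(dir, "end", end, sizeof(end)) ||
		    read_line(dir, "type", out[n].type, sizeof(out[n].type)))
			continue;
		out[n].start = strtoull(start, NULL, 16);
		out[n].end = strtoull(end, NULL, 16);
		n++;
	}
	closedir(d);
	return n;
}

/* True if [addr, addr + size) lies entirely inside one non-RAM entry. */
int range_is_reserved(const struct fw_entry *e, int n,
		      uint64_t addr, uint64_t size)
{
	assert(size > 0);
	for (int i = 0; i < n; i++)
		if (strcmp(e[i].type, "System RAM") != 0 &&
		    addr >= e[i].start && addr + size - 1 <= e[i].end)
			return 1;
	return 0;
}
```

Calling read_memmap("/sys/firmware/memmap", entries, 128) and then
range_is_reserved() with the address and size of the TPM log would show
whether the firmware map covers it. A result of 0 would be consistent
with the behaviour described in the report.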
As I recall the memblock allocator is the bootstrap memory allocator
used when bringing up the kernel. So I don't see reserving something
in the memblock allocator as being authoritative as to how the firmware
has set up memory.
I would suggest writing a patch to update whatever is calling
memblock_reserve to also, or perhaps in preference, update the e820
map. If the code is not x86 specific I would suggest using ACPI's
arch_reserve_mem_area call.
If you have a good path to the folks who write firmware for the
computers where this happens, it seems entirely reasonable to report
this as a bug to them as well.
Eric