[PATCH 0/2] arm64: kexec_file_load vs memory reservations
maz at kernel.org
Wed Jun 2 08:59:09 PDT 2021
On Wed, 02 Jun 2021 15:22:00 +0100,
James Morse <james.morse at arm.com> wrote:
> Hi Marc,
> On 29/04/2021 14:35, Marc Zyngier wrote:
> > It recently became apparent that using kexec with kexec_file_load() on
> > arm64 is pretty similar to playing Russian roulette.
> > Depending on the amount of memory, the HW supported and the firmware
> > interface used, your secondary kernel may overwrite critical memory
> > regions without which the secondary kernel cannot boot (the GICv3 LPI
> > tables being a prime example of such reserved regions).
> > It turns out that there are at least two ways for reserved memory
> > regions to be described to kexec: /proc/iomem for the userspace
> > implementation, and memblock.reserved for kexec_file.
> One is spilled into the other by request_standard_resources()...
> > And of course,
> > our LPI tables are only reserved using the resource tree, leading to
> > the aforementioned stamping.
> Presumably well after efi_init() has run...
Yup, much later. And we can keep on reserving memory as long as we
boot new CPUs. Having it as a one-off sync doesn't really help here.
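
To make the ordering problem concrete, here is a minimal userspace sketch. It is a toy model, not kernel code: `struct region_list`, `region_add()`, `sync_once()` and the addresses are all made up for illustration. The "early" list stands in for memblock.reserved and the "tree" list for the resource tree. It only shows that a single early copy cannot see reservations made afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: these types and helpers are illustrative, not kernel APIs. */
struct region {
	unsigned long start, size;
};

#define MAX_REGIONS 16

struct region_list {
	struct region r[MAX_REGIONS];
	size_t n;
};

static void region_add(struct region_list *l, unsigned long start,
		       unsigned long size)
{
	assert(l->n < MAX_REGIONS);
	l->r[l->n].start = start;
	l->r[l->n].size = size;
	l->n++;
}

/* True if [start, start + size) intersects any region in the list. */
static bool region_overlaps(const struct region_list *l, unsigned long start,
			    unsigned long size)
{
	for (size_t i = 0; i < l->n; i++) {
		const struct region *r = &l->r[i];

		if (start < r->start + r->size && r->start < start + size)
			return true;
	}
	return false;
}

/* One-off sync: copy whatever is in src right now into dst. */
static void sync_once(struct region_list *dst, const struct region_list *src)
{
	for (size_t i = 0; i < src->n; i++)
		region_add(dst, src->r[i].start, src->r[i].size);
}

/*
 * Replay the boot order discussed above and report whether a loader that
 * consults only the early list would treat a late reservation as free.
 */
static bool late_reservation_missed(void)
{
	struct region_list early = { .n = 0 };	/* stands in for memblock.reserved */
	struct region_list tree = { .n = 0 };	/* stands in for the resource tree */

	region_add(&early, 0x1000, 0x1000);	/* early boot reservation */
	sync_once(&tree, &early);		/* the single early-to-tree copy */
	region_add(&tree, 0x8000, 0x1000);	/* later, e.g. at CPU bring-up */

	/* The tree knows about the late region, the early list does not. */
	return region_overlaps(&tree, 0x8000, 0x1000) &&
	       !region_overlaps(&early, 0x8000, 0x1000);
}
```

In this model `late_reservation_missed()` returns true: anything that places an image by looking only at the early list would consider the late region available.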
> > Similar things could happen with ACPI tables as well.
> efi_init() calls reserve_regions(), which has:
> |	/* keep ACPI reclaim memory intact for kexec etc. */
> |	if (md->type == EFI_ACPI_RECLAIM_MEMORY)
> |		memblock_reserve(paddr, size);
> This is also what stops mm from allocating them, as
> memblock-reserved gets copied into the PG_Reserved flag by
> free_low_memory_core_early()'s calls to reserve_bootmem_region().
> Is your machine's firmware putting them in a region with a different type?
Good question. Moritz (cc'd) saw the tables being overwritten on his
system (which I don't have access to), so I guess it is not entirely
clear cut how this happens.
My SQ box reports the ACPI region as "ACPI Reclaim", so I guess it
works as expected here.
> (The UEFI spec has something to say: see 2.3.6 "AArch64 Platforms":
> | ACPI Tables loaded at boot time can be contained in memory of type EfiACPIReclaimMemory
> | (recommended) or EfiACPIMemoryNVS
> NVS would fail the is_usable_memory() check earlier, so gets treated
> as nomap)
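
The two outcomes James describes can be summarised in a small standalone sketch. This is a simplified model under stated assumptions, not the code in efi_init(): the `toy_` names are hypothetical, only three memory types are modelled, and memory attributes are ignored entirely. The numeric values of the types follow the UEFI specification.

```c
#include <stdbool.h>

/* Subset of EFI memory types; the toy_ names are illustrative only. */
enum toy_efi_type {
	TOY_CONVENTIONAL = 7,	/* EfiConventionalMemory */
	TOY_ACPI_RECLAIM = 9,	/* EfiACPIReclaimMemory */
	TOY_ACPI_NVS = 10,	/* EfiACPIMemoryNVS */
};

struct toy_outcome {
	bool nomap;	/* region is kept out of the usable memory map */
	bool reserved;	/* region is added to the reserved list */
};

/* Simplified stand-in for the usability check mentioned above. */
static bool toy_is_usable(enum toy_efi_type t)
{
	switch (t) {
	case TOY_CONVENTIONAL:
	case TOY_ACPI_RECLAIM:
		return true;
	default:
		return false;
	}
}

/* Apply the two rules from the discussion to a single region type. */
static struct toy_outcome toy_classify(enum toy_efi_type t)
{
	struct toy_outcome o;

	o.nomap = !toy_is_usable(t);		/* NVS ends up here */
	o.reserved = (t == TOY_ACPI_RECLAIM);	/* reclaim is kept intact */
	return o;
}
```

Under these assumptions, ACPI tables are protected in both recommended placements: reclaim memory is mapped but reserved, while NVS is never handed to the allocator in the first place.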
Note that I've since changed tactics and proposed that we fully rely
on the resource tree instead.
Without deviation from the norm, progress is not possible.