[PATCH 0/2] arm64: kexec_file_load vs memory reservations

Marc Zyngier maz at kernel.org
Wed Jun 2 08:59:09 PDT 2021


Hi James,

On Wed, 02 Jun 2021 15:22:00 +0100,
James Morse <james.morse at arm.com> wrote:
> 
> Hi Marc,
> 
> On 29/04/2021 14:35, Marc Zyngier wrote:
> > It recently became apparent that using kexec with kexec_file_load() on
> > arm64 is pretty similar to playing Russian roulette.
> > 
> > Depending on the amount of memory, the HW supported and the firmware
> > interface used, your secondary kernel may overwrite critical memory
> > regions without which the secondary kernel cannot boot (the GICv3 LPI
> > tables being a prime example of such reserved regions).
> > 
> > It turns out that there are at least two ways for reserved memory
> > regions to be described to kexec: /proc/iomem for the userspace
> > implementation, and memblock.reserved for kexec_file. 
> 
> One is spilled into the other by request_standard_resources()...
> 
> 
> > And of course,
> > our LPI tables are only reserved using the resource tree, leading to
> > the aforementioned stamping.
> 
> Presumably well after efi_init() has run...

Yup, much later. And we can keep on reserving memory as long as we
boot new CPUs. Having it as a one-off sync doesn't really help here.
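
To make the point concrete, here is a toy userspace model, not kernel
code. Every name in it (struct range_list, list_add(), list_overlaps(),
list_sync_once()) is made up for illustration. One list stands in for
the live set of reservations that keeps growing, the other for a
snapshot taken once. It only sketches why a snapshot goes stale, under
the assumption that reservations are plain [base, base + size) ranges:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one reserved physical range [base, base + size). */
struct range {
	uint64_t base;
	uint64_t size;
};

#define MAX_RANGES 16

/* Hypothetical stand-in for a set of reservations (live or snapshot). */
struct range_list {
	struct range r[MAX_RANGES];
	size_t nr;
};

/* Append a reservation; the toy model has a fixed capacity. */
static void list_add(struct range_list *l, uint64_t base, uint64_t size)
{
	assert(l->nr < MAX_RANGES && size > 0);
	l->r[l->nr].base = base;
	l->r[l->nr].size = size;
	l->nr++;
}

/* True if [base, base + size) intersects any range in the list. */
static bool list_overlaps(const struct range_list *l, uint64_t base,
			  uint64_t size)
{
	for (size_t i = 0; i < l->nr; i++) {
		const struct range *r = &l->r[i];

		if (base < r->base + r->size && r->base < base + size)
			return true;
	}
	return false;
}

/* One-off sync: copy what is reserved right now and never look again. */
static void list_sync_once(struct range_list *dst,
			   const struct range_list *src)
{
	*dst = *src;
}
```

If one range is added before list_sync_once() and a second one after
it (as would happen for a CPU brought up later), list_overlaps() on
the snapshot misses the second range while the live list catches it.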

> 
> > Similar things could happen with ACPI tables as well.
> 
> efi_init() calls reserve_regions(), which has:
> |	/* keep ACPI reclaim memory intact for kexec etc. */
> |	if (md->type == EFI_ACPI_RECLAIM_MEMORY)
> |		memblock_reserve(paddr, size);
> 
> This is also what stops mm from allocating them, as
> memblock-reserved gets copied into the PG_Reserved flag by
> free_low_memory_core_early()'s calls to reserve_bootmem_region().
> 
> Is your machine's firmware putting them in a region with a different type?

Good question. Moritz (cc'd) saw the tables being overwritten on his
system (which I don't have access to), so I guess it is not entirely
clear-cut how this happens.

My SQ box reports the ACPI region as "ACPI Reclaim", so I guess it
works as expected here.

> (The UEFI spec has something to say: see 2.3.6 "AArch64 Platforms":
> | ACPI Tables loaded at boot time can be contained in memory of type EfiACPIReclaimMemory
> | (recommended) or EfiACPIMemoryNVS
> 
> NVS would fail the is_usable_memory() check earlier, so gets treated
> as nomap)
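
The behaviour described above can be summarised with a small
userspace model. This is a simplified sketch, not the kernel's code:
model_is_usable() and model_reserve_region() are hypothetical names,
the treatment labels are mine, only three memory types are covered,
and memory attributes are ignored entirely. The type values are the
EFI_MEMORY_TYPE numbers from the UEFI spec:

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the EFI_MEMORY_TYPE values defined by the UEFI spec. */
enum efi_type {
	EFI_CONVENTIONAL_MEMORY = 7,
	EFI_ACPI_RECLAIM_MEMORY = 9,
	EFI_ACPI_MEMORY_NVS     = 10,
};

/* Hypothetical labels for how a region ends up being treated. */
enum treatment {
	TREAT_FREE,	/* usable RAM, available to the allocator */
	TREAT_RESERVED,	/* usable RAM, but memblock-reserved */
	TREAT_NOMAP,	/* fails the usability check, treated as nomap */
};

/* Simplified stand-in for the usability check; attributes are ignored. */
static int model_is_usable(uint32_t type)
{
	switch (type) {
	case EFI_CONVENTIONAL_MEMORY:
	case EFI_ACPI_RECLAIM_MEMORY:
		return 1;
	default:
		return 0;
	}
}

/* Simplified stand-in for the per-descriptor decision. */
static enum treatment model_reserve_region(uint32_t type)
{
	if (!model_is_usable(type))
		return TREAT_NOMAP;
	if (type == EFI_ACPI_RECLAIM_MEMORY)
		return TREAT_RESERVED;	/* kept intact for kexec etc. */
	return TREAT_FREE;
}
```

In this model, ACPI tables in EfiACPIReclaimMemory come out as
reserved, tables in EfiACPIMemoryNVS come out as nomap, and only
conventional memory is free for a new kernel image to land in.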

Note that I've since changed tactics and proposed that we fully rely
on the resource tree instead[1].

Thanks,

	M.

[1] https://lore.kernel.org/r/20210531095720.77469-1-maz@kernel.org

-- 
Without deviation from the norm, progress is not possible.


