[PATCH v2 1/2] kexec: Add KEXEC_FILE_NO_CMA as a legal flag
Baoquan He
bhe at redhat.com
Mon Sep 8 23:33:52 PDT 2025
On 09/04/25 at 07:58pm, Brian Mak wrote:
> On Aug 25, 2025, at 11:49 AM, Brian Mak <makb at juniper.net> wrote:
>
> > On Aug 21, 2025, at 8:33 PM, Baoquan He <bhe at redhat.com> wrote:
> >
> >> Yeah, this is a good catch and great fix. Without this fix,
> >> kexec_file_load syscall will failed and return '-EINVAL' when
> >> KEXEC_FILE_NO_CMA is specified just as below code shows. So, for this
> >> patch,
> >>
> >> Acked-by: Baoquan He <bhe at redhat.com>
> >
> > Hi Baoquan,
> >
> > Thanks for the ACK!
> >
> >> And, by the way, has the user space kexec-tools got the change merged
> >> to allow KEXEC_FILE_NO_CMA specified?
> >
> > I don't see any recent commits to kexec-tools to support
> > KEXEC_FILE_NO_CMA.
> >
> >>> From: Brian Mak <makb at juniper.net>
> >>> Subject: x86/kexec: carry forward the boot DTB on kexec
> >>> Date: Tue, 5 Aug 2025 14:15:27 -0700
> >>>
> >>> Currently, the kexec_file_load syscall on x86 does not support passing a
> >>> device tree blob to the new kernel. Some embedded x86 systems use device
> >>> trees. On these systems, failing to pass a device tree to the new kernel
> >>> causes a boot failure.
> >>>
> >>> To add support for this, we copy the behavior of ARM64 and PowerPC and
> >>> copy the current boot's device tree blob for use in the new kernel. We do
> >>> this on x86 by passing the device tree blob as a setup_data entry in
> >>> accordance with the x86 boot protocol.
> >>>
> >>> This behavior is gated behind the KEXEC_FILE_FORCE_DTB flag.
> >>>
> >>> Link: https://urldefense.com/v3/__https://lkml.kernel.org/r/20250805211527.122367-3-makb@juniper.net__;!!NEt6yMaO-gk!EbJyF8xO2E51MyYdN3_zqCBBMj0JKoiKoPuG_8vEctQMk9uCyjX0LdSEH_FGkPDV8egxzc7w$
> >>> Signed-off-by: Brian Mak <makb at juniper.net>
> >>> Cc: Alexander Graf <graf at amazon.com>
> >>> Cc: Baoquan He <bhe at redhat.com>
> >>> Cc: Borislav Betkov <bp at alien8.de>
> >>> Cc: Dave Young <dyoung at redhat.com>
> >>> Cc: "H. Peter Anvin" <hpa at zytor.com>
> >>> Cc: Ingo Molnar <mingo at redhat.com>
> >>> Cc: Rob Herring <robh at kernel.org>
> >>> Cc: Saravana Kannan <saravanak at google.com>
> >>> Cc: Thomas Gleinxer <tglx at linutronix.de>
> >>> Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
> >>> ---
> >>>
> >>> arch/x86/kernel/kexec-bzimage64.c | 47 ++++++++++++++++++++++++++--
> >>> include/linux/kexec.h | 5 ++
> >>> include/uapi/linux/kexec.h | 4 ++
> >>> kernel/kexec_file.c | 1
> >>> 4 files changed, 53 insertions(+), 4 deletions(-)
> >>>
> >>> --- a/arch/x86/kernel/kexec-bzimage64.c~x86-kexec-carry-forward-the-boot-dtb-on-kexec
> >>> +++ a/arch/x86/kernel/kexec-bzimage64.c
> >>> @@ -16,6 +16,8 @@
> >>> #include <linux/kexec.h>
> >>> #include <linux/kernel.h>
> >>> #include <linux/mm.h>
> >>> +#include <linux/libfdt.h>
> >>> +#include <linux/of_fdt.h>
> >>> #include <linux/efi.h>
> >>> #include <linux/random.h>
> >>>
> >>> @@ -212,6 +214,28 @@ setup_efi_state(struct boot_params *para
> >>> }
> >>> #endif /* CONFIG_EFI */
> >>>
> >>> +#ifdef CONFIG_OF_FLATTREE
> >>> +static void setup_dtb(struct boot_params *params,
> >>> + unsigned long params_load_addr,
> >>> + unsigned int dtb_setup_data_offset)
> >>> +{
> >>> + struct setup_data *sd = (void *)params + dtb_setup_data_offset;
> >>> + unsigned long setup_data_phys, dtb_len;
> >>> +
> >>> + dtb_len = fdt_totalsize(initial_boot_params);
> >>> + sd->type = SETUP_DTB;
> >>> + sd->len = dtb_len;
> >>> +
> >>> + /* Carry over current boot DTB with setup_data */
> >>> + memcpy(sd->data, initial_boot_params, dtb_len);
> >>> +
> >>> + /* Add setup data */
> >>> + setup_data_phys = params_load_addr + dtb_setup_data_offset;
> >>> + sd->next = params->hdr.setup_data;
> >>> + params->hdr.setup_data = setup_data_phys;
> >>> +}
> >>> +#endif /* CONFIG_OF_FLATTREE */
> >>> +
> >>> static void
> >>> setup_ima_state(const struct kimage *image, struct boot_params *params,
> >>> unsigned long params_load_addr,
> >>> @@ -336,6 +360,17 @@ setup_boot_parameters(struct kimage *ima
> >>> sizeof(struct efi_setup_data);
> >>> #endif
> >>>
> >>> +#ifdef CONFIG_OF_FLATTREE
> >>> + if (image->force_dtb && initial_boot_params) {
> >>> + setup_dtb(params, params_load_addr, setup_data_offset);
> >>> + setup_data_offset += sizeof(struct setup_data) +
> >>> + fdt_totalsize(initial_boot_params);
> >>> + } else {
> >>> + pr_debug("Not carrying over DTB, force_dtb = %d\n",
> >>> + image->force_dtb);
> >>> + }
> >>> +#endif
> >>> +
> >>> if (IS_ENABLED(CONFIG_IMA_KEXEC)) {
> >>> /* Setup IMA log buffer state */
> >>> setup_ima_state(image, params, params_load_addr,
> >>> @@ -529,6 +564,12 @@ static void *bzImage64_load(struct kimag
> >>> sizeof(struct setup_data) +
> >>> RNG_SEED_LENGTH;
> >>>
> >>> +#ifdef CONFIG_OF_FLATTREE
> >>> + if (image->force_dtb && initial_boot_params)
> >>> + kbuf.bufsz += sizeof(struct setup_data) +
> >>> + fdt_totalsize(initial_boot_params);
> >>> +#endif
> >>> +
> >>> if (IS_ENABLED(CONFIG_IMA_KEXEC))
> >>> kbuf.bufsz += sizeof(struct setup_data) +
> >>> sizeof(struct ima_setup_data);
> >>> @@ -537,7 +578,7 @@ static void *bzImage64_load(struct kimag
> >>> kbuf.bufsz += sizeof(struct setup_data) +
> >>> sizeof(struct kho_data);
> >>>
> >>> - params = kzalloc(kbuf.bufsz, GFP_KERNEL);
> >>> + params = kvzalloc(kbuf.bufsz, GFP_KERNEL);
> >>
> >> Wondering how big the dtb blob is, can you explain a little bit about
> >> the kvzalloc usage here?
> >>
> >> Except of this, I have no other concern about this patch.
> >>
> >> And what's your plan about the user space kexec-tool change?
> >
> > When I tested this earlier on x86, the DTB was allowed to be up to just
> > under 64 pages large before the DTB failed to load. This is because it
> > has to fit into an early_memremap() mapping (relevant code snippet at
> > the bottom). Since the allocation can be many pages, I changed the
> > kzalloc to a kvzalloc.
> >
> > For the kexec-tools change, I have a draft change that I've already
> > shared on this thread for testing purposes. I believe you said you were
> > going to test it, but I haven't heard anything back from that yet. I'll
> > raise that change for review properly once this kernel commit is in
> > mainline.
> >
> > ---------
> >
> > void __init x86_flattree_get_config(void)
> > {
> > #ifdef CONFIG_OF_EARLY_FLATTREE
> > u32 size, map_len;
> > void *dt;
> >
> > if (initial_dtb) {
> > map_len = max(PAGE_SIZE - (initial_dtb & ~PAGE_MASK), (u64)128);
> >
> > dt = early_memremap(initial_dtb, map_len);
> > size = fdt_totalsize(dt);
> > if (map_len < size) {
> > early_memunmap(dt, map_len);
> > dt = early_memremap(initial_dtb, size);
> > map_len = size;
> > }
> >
> > early_init_dt_verify(dt, __pa(dt));
> > }
> >
> > unflatten_and_copy_device_tree();
> >
> > if (initial_dtb)
> > early_memunmap(dt, map_len);
> > #endif
> > if (acpi_disabled && of_have_populated_dt())
> > x86_init.mpparse.parse_smp_cfg = x86_dtb_parse_smp_config;
> > }
> >
> > ---------
> >
> > Thanks,
> > Brian
>
> Hi Baoquan,
>
> Just wanted to check back on this. Did you have any further concerns? If
> not, would you be able to provide an ACK?
I thought I have acked it, seems I only ack-ed the 1st one, then ack the
2nd patch:
Acked-by: Baoquan He <bhe at redhat.com>
More information about the kexec
mailing list