[PATCH v3 2/3] drivers/firmware: add SDEI cross-CPU NMI service for arm64
Puranjay Mohan
puranjay12 at gmail.com
Mon Jun 15 03:18:10 PDT 2026
On Mon, Jun 15, 2026 at 4:35 AM Kiryl Shutsemau <kirill at shutemov.name> wrote:
>
> From: "Kiryl Shutsemau (Meta)" <kas at kernel.org>
>
> Deliver an NMI-like event to an interrupt-masked arm64 CPU via the
> standard SDEI software-signalled event (event 0), without the pseudo-NMI
> hot-path cost: register a handler for event 0 and poke a target with
> sdei_event_signal(0, mpidr).
>
> First user is arch_trigger_cpumask_backtrace() (sysrq-l, RCU stalls,
> hung-task/soft-lockup dumps), which otherwise rides an IPI that can't
> reach a masked CPU. Falls back to the IPI path when SDEI is absent; no
> watchdog backend yet, so the stock detector is untouched.
>
> Signed-off-by: Kiryl Shutsemau (Meta) <kas at kernel.org>
> Reviewed-by: Douglas Anderson <dianders at chromium.org>
> ---
> MAINTAINERS | 2 +-
> arch/arm64/include/asm/nmi.h | 24 +++++
> arch/arm64/kernel/smp.c | 11 +++
> drivers/firmware/Kconfig | 19 ++++
> drivers/firmware/Makefile | 1 +
> drivers/firmware/arm_sdei_nmi.c | 149 ++++++++++++++++++++++++++++++++
> 6 files changed, 205 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm64/include/asm/nmi.h
> create mode 100644 drivers/firmware/arm_sdei_nmi.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c8d4b913f26c..b5ddfb85dce9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -24797,7 +24797,7 @@ M: James Morse <james.morse at arm.com>
> L: linux-arm-kernel at lists.infradead.org (moderated for non-subscribers)
> S: Maintained
> F: Documentation/devicetree/bindings/arm/firmware/sdei.txt
> -F: drivers/firmware/arm_sdei.c
> +F: drivers/firmware/arm_sdei*
> F: include/linux/arm_sdei.h
> F: include/uapi/linux/arm_sdei.h
>
> diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
> new file mode 100644
> index 000000000000..9366be419d18
> --- /dev/null
> +++ b/arch/arm64/include/asm/nmi.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_NMI_H
> +#define __ASM_NMI_H
> +
> +#include <linux/cpumask.h>
> +
> +/*
> + * Cross-CPU NMI provider hooks, consulted by the arm64 arch code before
> + * its regular-IRQ / pseudo-NMI IPI paths. The SDEI provider in
> + * drivers/firmware/arm_sdei_nmi.c implements them when active; a future
> + * FEAT_NMI provider could slot in here too. The stubs let callers stay
> + * unconditional when ARM_SDEI_NMI is off.
> + */
> +#ifdef CONFIG_ARM_SDEI_NMI
> +bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu);
> +#else
> +static inline bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
> + int exclude_cpu)
> +{
> + return false;
> +}
> +#endif
> +
> +#endif /* __ASM_NMI_H */
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1aa324104afb..a670434a8cae 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -45,6 +45,7 @@
> #include <asm/daifflags.h>
> #include <asm/kvm_mmu.h>
> #include <asm/mmu_context.h>
> +#include <asm/nmi.h>
> #include <asm/numa.h>
> #include <asm/processor.h>
> #include <asm/smp_plat.h>
> @@ -927,6 +928,16 @@ static void arm64_backtrace_ipi(cpumask_t *mask)
>
> void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
> {
> + /*
> + * Prefer the SDEI cross-CPU NMI provider when active: firmware
> + * dispatches the event out of EL3 and reaches CPUs that have
> + * interrupts locally masked, without the per-IRQ-mask cost that
> + * pseudo-NMI pays for the same reach. The plain IPI path below
> + * can't reach such a CPU unless pseudo-NMI is enabled.
> + */
> + if (sdei_nmi_trigger_cpumask_backtrace(mask, exclude_cpu))
> + return;
> +
> /*
> * NOTE: though nmi_trigger_cpumask_backtrace() has "nmi_" in the name,
> * nothing about it truly needs to be implemented using an NMI, it's
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index bbd2155d8483..6501087ff90d 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -36,6 +36,25 @@ config ARM_SDE_INTERFACE
> standard for registering callbacks from the platform firmware
> into the OS. This is typically used to implement RAS notifications.
>
> +config ARM_SDEI_NMI
> + bool "SDEI-based cross-CPU NMI service (arm64)"
> + depends on ARM64 && ARM_SDE_INTERFACE
> + help
> + Provides SDEI-based cross-CPU NMI delivery for hooks that need
> + to reach interrupt-masked CPUs on silicon that lacks FEAT_NMI:
> +
> + - arch_trigger_cpumask_backtrace() (sysrq-l, RCU stalls,
> + hardlockup_all_cpu_backtrace, soft-lockup secondary dumps,
> + hung-task auxiliary dumps)
> +
> + The driver registers a handler for the SDEI software-signalled
> + event (event 0) and reaches a target CPU by signalling it with
> + SDEI_EVENT_SIGNAL. Firmware delivers the event out of EL3
> + regardless of the target's PSTATE.DAIF -- forced delivery into a
> + CPU wedged with interrupts locally masked.
> +
> + If unsure, say N.
> +
> config EDD
> tristate "BIOS Enhanced Disk Drive calls determine boot disk"
> depends on X86
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 4ddec2820c96..be46f1e1dc77 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -4,6 +4,7 @@
> #
> obj-$(CONFIG_ARM_SCPI_PROTOCOL) += arm_scpi.o
> obj-$(CONFIG_ARM_SDE_INTERFACE) += arm_sdei.o
> +obj-$(CONFIG_ARM_SDEI_NMI) += arm_sdei_nmi.o
> obj-$(CONFIG_DMI) += dmi_scan.o
> obj-$(CONFIG_DMI_SYSFS) += dmi-sysfs.o
> obj-$(CONFIG_EDD) += edd.o
> diff --git a/drivers/firmware/arm_sdei_nmi.c b/drivers/firmware/arm_sdei_nmi.c
> new file mode 100644
> index 000000000000..a82776e7b55a
> --- /dev/null
> +++ b/drivers/firmware/arm_sdei_nmi.c
> @@ -0,0 +1,149 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * arm64 SDEI-based cross-CPU NMI service.
> + *
> + * Delivering an "NMI-shaped" event to an EL1 context that has locally
> + * masked interrupts, on silicon without FEAT_NMI, can be done two ways:
> + *
> + * - pseudo-NMI: mask "interrupts" via the GIC priority register
> + * (ICC_PMR_EL1) instead of PSTATE.DAIF, leaving a high-priority band
> + * deliverable. Functionally this works -- but it reimplements every
> + * local_irq_disable()/enable() and exception entry/exit as a PMR
> + * write plus synchronisation, a cost paid on that hot path forever,
> + * whether or not an NMI is ever delivered.
> + *
> + * - SDEI: leave interrupt masking as the cheap PSTATE.DAIF operation
> + * and have the firmware bounce an EL3-routed Group-0 SGI back to
> + * NS-EL1 as an event callback. The cost is a firmware round-trip,
> + * but only at the rare moment delivery is actually needed.
> + *
> + * This driver takes the second path: it keeps the IRQ-mask hot path
> + * free and pays only when it fires, which is what makes cross-CPU NMI
> + * affordable on hardware where the pseudo-NMI tax isn't, until FEAT_NMI
> + * makes NMI masking cheap in the architecture itself.
> + *
> + * Capabilities provided:
> + *
> + * - sdei_nmi_trigger_cpumask_backtrace() — override for arm64's
> + * arch_trigger_cpumask_backtrace(), so sysrq-l, RCU stall dumps,
> + * hardlockup_all_cpu_backtrace, soft-lockup/hung-task secondary
> + * dumps all reach interrupt-masked CPUs.
> + *
> + * Delivery uses the standard SDEI software-signalled event (event 0) and
> + * SDEI_EVENT_SIGNAL. We register a handler for event 0, enable it, and
> + * poke a target CPU with sdei_event_signal(0, mpidr): firmware makes
> + * event 0 pending on that PE and dispatches the handler NMI-like,
> + * regardless of the target's DAIF.
> + * Availability is simply whether event 0 registers and enables -- if SDEI
> + * and its software-signalled event are present we use it, otherwise the
> + * driver stays inert.
> + */
> +
> +#define pr_fmt(fmt) "sdei_nmi: " fmt
> +
> +#include <linux/arm_sdei.h>
> +#include <linux/cpumask.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/kprobes.h>
> +#include <linux/nmi.h>
> +#include <linux/printk.h>
> +#include <linux/ptrace.h>
> +#include <linux/smp.h>
> +#include <linux/types.h>
> +
> +#include <asm/nmi.h>
> +#include <asm/smp_plat.h>
> +
> +static bool sdei_nmi_available;
> +
> +#define SDEI_NMI_EVENT 0
> +
> +static int sdei_nmi_handler(u32 event, struct pt_regs *regs, void *arg)
> +{
> + /*
> + * nmi_cpu_backtrace() no-ops unless this CPU's bit is set in the
> + * global backtrace mask (driven by nmi_trigger_cpumask_backtrace()),
> + * so a fire that reaches a CPU not being backtraced is harmless.
> + */
> + nmi_cpu_backtrace(regs);
> + return SDEI_EV_HANDLED;
> +}
> +NOKPROBE_SYMBOL(sdei_nmi_handler);
> +
> +static void sdei_nmi_fire(unsigned int target_cpu)
> +{
> + int err = sdei_event_signal(SDEI_NMI_EVENT, cpu_logical_map(target_cpu));
> +
> + if (err)
> + pr_warn("SDEI_EVENT_SIGNAL to CPU %u failed: %d\n",
> + target_cpu, err);
> +}
> +
> +/*
> + * Raise callback for nmi_trigger_cpumask_backtrace(): signal event 0
> + * at every CPU still pending in @mask. The framework excludes the local
> + * CPU from @mask before calling us.
> + */
> +static void sdei_nmi_raise_backtrace(cpumask_t *mask)
> +{
> + unsigned int cpu;
> +
> + for_each_cpu(cpu, mask)
> + sdei_nmi_fire(cpu);
> +}
> +
> +/*
> + * Override hook for arch_trigger_cpumask_backtrace() (see
> + * arch/arm64/kernel/smp.c). Returns true when SDEI handled the request,
> + * which is the case whenever SDEI is active; on a false return the arch
> + * falls back to its regular-IRQ (or pseudo-NMI, if enabled) IPI.
> + *
> + * On a kernel built without paying the pseudo-NMI hot-path cost (the
> + * usual case for this driver's target), the IPI can't reach a CPU that
> + * has interrupts masked -- so the backtrace of the one CPU you care
> + * about comes back empty. SDEI is dispatched out of EL3 and lands
> + * regardless of the target's DAIF, without taxing the IRQ-mask path.
> + */
> +bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
> +{
> + if (!sdei_nmi_available)
> + return false;
> +
> + nmi_trigger_cpumask_backtrace(mask, exclude_cpu,
> + sdei_nmi_raise_backtrace);
> + return true;
> +}
> +
> +/*
> + * device_initcall (after arch_initcall(sdei_init), so the SDEI subsystem
> + * is up): probe the firmware, register the event, and turn on the
> + * cross-CPU service. If the probe fails the driver stays inert and the
> + * override hooks decline, leaving the arch's own paths in place.
> + */
> +static int __init sdei_nmi_init(void)
> +{
> + int err;
> +
> + err = sdei_event_register(SDEI_NMI_EVENT, sdei_nmi_handler, NULL);
> + if (err) {
> + pr_err("sdei_event_register(%u) failed: %d\n",
> + SDEI_NMI_EVENT, err);
> + return 0;
> + }

This initcall runs unconditionally whenever ARM_SDEI_NMI is built in,
including on the many arm64 systems that have no SDEI at all. On
those, sdei_event_register() -> sdei_event_create() ->
invoke_sdei_fn() returns -EIO, and the core already complains:

	pr_warn("Failed to create event %u: %d\n", event_num, err);

(that one isn't gated on err != -EIO, unlike sdei_mask_local_cpu() &
friends). The pr_err() here then adds a second line on top, so every
boot on a non-SDEI box with this config gets two alarming messages for
what is just "no firmware support".

At minimum, don't shout for -EIO/-EOPNOTSUPP here. Better, skip the
probe entirely when SDEI isn't present. There's no exported predicate
for that today, but -EIO is the de-facto one.

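For the minimal fix, something along these lines might do. This is an
untested sketch, not a tested patch: sdei_nmi_err_is_no_firmware() is a
hypothetical helper name that exists neither in this series nor in the
SDEI core, and it is written as standalone C using the userspace errno
values so the check can be compiled outside the kernel. The intended
call site is shown only as a comment.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (an assumption for illustration, not part of the
 * patch or of drivers/firmware/arm_sdei.c): returns true when @err only
 * means "no usable SDEI firmware", as opposed to a real registration
 * failure that deserves a pr_err().
 */
static bool sdei_nmi_err_is_no_firmware(int err)
{
	return err == -EIO || err == -EOPNOTSUPP;
}

/*
 * Intended use in sdei_nmi_init(), kept as a comment because the kernel
 * calls are not available in a standalone build:
 *
 *	err = sdei_event_register(SDEI_NMI_EVENT, sdei_nmi_handler, NULL);
 *	if (err) {
 *		if (!sdei_nmi_err_is_no_firmware(err))
 *			pr_err("sdei_event_register(%u) failed: %d\n",
 *			       SDEI_NMI_EVENT, err);
 *		return 0;
 *	}
 */
```

This only silences the second message; the pr_warn() in the core would
still fire unless the probe is skipped before sdei_event_register() is
called.
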
> + err = sdei_event_enable(SDEI_NMI_EVENT);
> + if (err) {
> + pr_err("sdei_event_enable(%u) failed: %d\n",
> + SDEI_NMI_EVENT, err);
> + sdei_event_unregister(SDEI_NMI_EVENT);
> + return 0;
> + }
> +
> + sdei_nmi_available = true;
> + pr_info("using SDEI cross-CPU NMI (SDEI_EVENT_SIGNAL, event %u)\n",
> + SDEI_NMI_EVENT);
> +
> + return 0;
> +}
> +device_initcall(sdei_nmi_init);
> --
> 2.54.0
>