[PATCH 4/4] arm64: route crash_smp_send_stop() last resort through SDEI
Doug Anderson
dianders at chromium.org
Fri Jun 5 13:42:57 PDT 2026
Hi,
On Wed, Jun 3, 2026 at 7:36 AM Kiryl Shutsemau <kirill at shutemov.name> wrote:
>
> @@ -1288,8 +1288,32 @@ void crash_smp_send_stop(void)
> return;
> crash_stop = 1;
>
> + /*
> + * Stop the normal way first: IPI_CPU_STOP escalating to a pseudo-NMI
> + * IPI. Every CPU that responds saves its state via crash_save_cpu()
> + * and parks in cpu_park_loop() with its online bit cleared -- the
> + * standard kdump stop, identical to a kernel without SDEI. Crucially
> + * those CPUs stay in a clean, potentially-reusable state.
> + */
> smp_send_stop();
>
> + /*
> + * Whatever is still online didn't respond -- typically a CPU wedged
> + * with interrupts masked. The plain IPI can't reach it, and a fleet
> + * that declines the pseudo-NMI hot-path cost has no NMI IPI to
> + * escalate to. Hit only the survivors with the SDEI cross-CPU NMI
> + * (no-op if SDEI isn't active, or if everything already stopped):
> + * firmware delivers out of EL3 regardless of PSTATE.DAIF, and the
> + * handler captures crash_save_cpu() state from the wedged context
> + * before parking the CPU.
> + *
> + * SDEI is deliberately last: an SDEI-stopped CPU never completes its
> + * event (it parks inside the handler, so EL3 retains its dispatch
> + * slot until reset), which is strictly less recoverable than a normal
> + * stop. We pay that only for CPUs that left no other way to reach them.
> + */
> + sdei_nmi_crash_smp_send_stop();

It feels weird to me that you're adding SDEI for "crash stop" but not
for regular "stop". I think you should modify smp_send_stop() to
fall back to SDEI if sending the NMI failed, instead of adding this
separate path.

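To make the shape of that suggestion concrete, here is a rough userspace
model of a single stop routine that escalates through its stages and only
targets CPUs the earlier stages missed. It is a sketch, not kernel code:
stop_model, model_send_stop() and the per-stage masks are all made up for
illustration, and it assumes fewer than 64 CPUs so a uint64_t can stand in
for a cpumask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Escalation order inside one stop routine (model only). */
enum stop_stage { STAGE_IPI, STAGE_NMI, STAGE_SDEI, STAGE_COUNT };

/* Hypothetical model state; bit n of each mask represents CPU n. */
struct stop_model {
	uint64_t online;                 /* CPUs still running */
	uint64_t reachable[STAGE_COUNT]; /* CPUs each stage is able to stop */
	bool available[STAGE_COUNT];     /* whether the stage is usable at all */
};

/*
 * Try each available stage in order, targeting only the CPUs that are
 * still pending. Returns the mask of CPUs that no stage could stop.
 * Assumes self < 64.
 */
static uint64_t model_send_stop(struct stop_model *m, unsigned int self)
{
	uint64_t pending = m->online & ~(UINT64_C(1) << self);

	for (int s = 0; s < STAGE_COUNT && pending; s++) {
		uint64_t stopped;

		if (!m->available[s])
			continue;

		stopped = pending & m->reachable[s];
		m->online &= ~stopped;
		pending &= ~stopped;
	}

	return pending;
}
```

With pseudo-NMI marked unavailable, the model goes straight from the plain
IPI to SDEI for whatever is left, which is the fallback order the patch
comments describe, just living in one place.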
> static int sdei_nmi_handler(u32 event, struct pt_regs *regs, void *arg)
> {
> + int cpu = smp_processor_id();
> +
> + if (READ_ONCE(*this_cpu_ptr(&sdei_nmi_crash_stop_requested))) {
> + WRITE_ONCE(*this_cpu_ptr(&sdei_nmi_crash_stop_requested), 0);
> +
> + /*
> + * Capture the wedged context for kdump while pt_regs still
> + * points at the interrupted PC. This is the main motivation
> + * for using SDEI here: the plain IPI stop path can't reach an
> + * interrupt-masked CPU (and the fleet declines pseudo-NMI to
> + * keep the IRQ-mask hot path cheap), so crash_save_cpu() for
> + * that CPU would otherwise record nothing useful.
> + */
> + crash_save_cpu(regs, cpu);
> + set_cpu_online(cpu, false);
> +
> + /* publish the crash state/offline before the requester sees the ack */
> + smp_wmb();
> + WRITE_ONCE(*this_cpu_ptr(&sdei_nmi_crash_stop_acked), 1);
> +
> + /*
> + * Park forever from within the SDEI handler. We deliberately
> + * do NOT issue SDEI_EVENT_COMPLETE: the framework's return
> + * path restores firmware's saved interrupted context, which
> + * would land the CPU back wherever it was running (often
> + * do_idle, which then notices cpu_is_offline=true and BUGs
> + * at cpuhp_report_idle_dead). Returning the modified pt_regs
> + * doesn't help -- arch/arm64/kernel/sdei.c::do_sdei_event
> + * only honours a PC override via its IRQ-state heuristic
> + * and otherwise hands EL3 its own saved-context slot back.
> + *
> + * Trade-off: EL3 firmware retains ~one saved-context slot
> + * per parked CPU until the next hardware reset (~hundreds of
> + * bytes per CPU). The CPU itself is parked in cpu_park_loop
> + * exactly as if IPI_CPU_STOP had stopped it; recoverability
> + * is unchanged versus the existing path (neither is
> + * recoverable without hardware reset, since PSCI sees the
> + * CPU as ALREADY_ON in both cases).
> + */
> + cpu_park_loop();
> + /* unreachable */

Any chance we could avoid duplicating the save / offline / park steps
from ipi_cpu_crash_stop()?

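One possible shape, sketched as a userspace model rather than real kernel
code: both stop paths call a shared helper for the save + offline steps, and
the SDEI path only adds its ack before parking. Every name below
(cpu_model, model_crash_save_and_offline(), and so on) is hypothetical, and
the boolean fields merely stand in for crash_save_cpu(), set_cpu_online()
and cpu_park_loop().

```c
#include <stdbool.h>

/* Hypothetical per-CPU state for the model. */
struct cpu_model {
	bool online;
	bool state_saved;
	bool parked;
	bool sdei_acked;
};

/* Shared part: the steps both stop paths need, in the same order. */
static void model_crash_save_and_offline(struct cpu_model *c)
{
	c->state_saved = true; /* stands in for crash_save_cpu() */
	c->online = false;     /* stands in for set_cpu_online(cpu, false) */
}

/* Stands in for cpu_park_loop(), which never returns in the kernel. */
static void model_park(struct cpu_model *c)
{
	c->parked = true;
}

/* Model of the IPI-driven crash stop. */
static void model_ipi_crash_stop(struct cpu_model *c)
{
	model_crash_save_and_offline(c);
	model_park(c);
}

/* Model of the SDEI-driven crash stop: same steps plus the ack. */
static void model_sdei_crash_stop(struct cpu_model *c)
{
	model_crash_save_and_offline(c);
	c->sdei_acked = true; /* published after the state, before parking */
	model_park(c);
}
```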
> +bool sdei_nmi_crash_smp_send_stop(void)
> +{
> + unsigned int this_cpu, cpu, remaining;
> + unsigned long timeout;
> + cpumask_t mask;

The above will probably get you a yell. Putting "cpumask_t" on the
stack is a no-no since it can be quite large with a big CONFIG_NR_CPUS.
This is why it's nearly always declared "static".

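To put a number on "quite large", here is a small userspace model of the
mask layout (one bit per possible CPU, stored in an array of unsigned
long). The 4096-CPU figure is an assumed configuration picked for
illustration, and the model_* names are hypothetical, not kernel
identifiers.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed CPU limit for illustration; stands in for CONFIG_NR_CPUS. */
#define MODEL_NR_CPUS 4096

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_BITS_TO_LONGS(n) \
	(((n) + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

/* Same shape as a cpumask: a bitmap with one bit per possible CPU. */
struct model_cpumask {
	unsigned long bits[MODEL_BITS_TO_LONGS(MODEL_NR_CPUS)];
};

/*
 * Static storage: costs no stack space. This is only safe when callers
 * are serialized, which the model assumes.
 */
static struct model_cpumask model_stop_mask;

/* Bytes a single mask would take if it were placed on the stack. */
static size_t model_cpumask_bytes(void)
{
	return sizeof(model_stop_mask);
}
```

At 4096 CPUs that is 512 bytes for one local variable, which is a large
bite out of a kernel stack on a path that may already be deep.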
-Doug