[PATCH v4 0/4] arm64: cross-CPU NMI via SDEI

Doug Anderson dianders at chromium.org
Mon Jun 22 09:52:40 PDT 2026


Hi,

On Mon, Jun 22, 2026 at 6:56 AM Kiryl Shutsemau <kirill at shutemov.name> wrote:
>
> The ratio is flat across the whole 1-to-72 sweep, so -- relevant to the
> scalability question -- it's a constant per-syscall tax, not a contention
> effect. The impact tracks syscall/exception density: page_fault1, a more
> realistic workload, stays within ~5%.

FWIW, I'm not sure 5% is truly a "realistic" measure of the overhead
here. While it cannot be denied that pseudo-NMI has some overhead,
every time I've seen someone study it in "real world" scenarios it
ends up being fairly negligible. Not zero, mind you, but some
amount below 1%. By this, I mean put real-world workloads on a
system with pseudo-NMI enabled and pick your favorite metric. Perhaps
measure the amount of work completed in a given time. I dunno if this
is possible for you to do in this case.


> > The direction of travel is to deprecate SDEI. I wouldn't add more stuff
> > on top of this interface.
>
> I understand FEAT_NMI is the long-term answer, and I'm not arguing against
> deprecating SDEI. My concern is the gap in between. By our estimate it's
> 10+ years before the last non-FEAT_NMI machine retires from the fleet --
> for scale, we're still running Skylake today. So there's roughly a
> decade where a large installed base has neither FEAT_NMI nor affordable
> pseudo-NMI, and no way to reach a DAIF-masked CPU for an all-CPU
> backtrace or to capture a wedged CPU in a crash dump. That's the
> functional gap this series tries to cover.
>
> Given the deprecation direction, I deliberately kept the SDEI footprint as
> small as I could. The series adds no new firmware interface and no vendor
> SMC -- it uses only the standard software-signalled event (event 0) via
> SDEI_EVENT_SIGNAL, which is already present on these systems for
> firmware-first RAS (APEI/GHES). And SDEI is only ever invoked in a "bad
> state": to deliver a backtrace signal to a CPU that a normal IPI can't
> reach, or to stop a CPU that ignored the stop IPIs. Nothing on any hot or
> steady-state path touches it.
>
> If even that minimal use is unacceptable on a deprecated interface, I'd
> rather know now and redirect the effort -- but I'd appreciate a pointer to
> what should cover this gap for existing silicon in the meantime.

FWIW, despite the fact that pseudo-NMI overhead (in my experience) ends
up being mostly negligible, personally I'd be in favor of landing Kiryl's
patches. I dunno how many times I've had the discussion of pseudo-NMI
overhead over the years and it's certainly very easy to show that
there is _some_ overhead and that in pathological cases there is a lot
of overhead. As Kiryl says, the patches don't add a ton of extra
complexity and they even combine some of the stop/crash-stop code,
which is nice. Having them as a stop-gap until true NMI is available
seems worthwhile to me. Of course, I'm not an ARM maintainer, so it's
obviously not up to me. :-)

-Doug


