[PATCH v3 6/6] lib: sbi: hart: Detect and enable Smrnmi before trap-based feature detection
Evgeny Voevodin
evvoevod at tenstorrent.com
Thu May 14 13:27:03 PDT 2026
On Tue May 12, 2026 at 3:40 PM UTC, Anup Patel wrote:
> On Tue, May 12, 2026 at 2:00 PM Nylon Chen <nylon.chen at sifive.com> wrote:
> >
> > Anup Patel <anup at brainfault.org> 於 2026年5月12日週二 下午3:59寫道:
> > >
> > > On Tue, May 12, 2026 at 12:14 PM Nylon Chen <nylon.chen at sifive.com> wrote:
> > > >
> > > > Anup Patel <anup at brainfault.org> 於 2026年5月9日週六 下午2:51寫道:
> > > >
> > > > >
> > > > > On Thu, May 7, 2026 at 11:38 PM Evgeny Voevodin
> > > > > <evvoevod at tenstorrent.com> wrote:
> > > > > >
> > > > > > The location of the RNMI/E trap vectors in the Smrnmi extension is
> > > > > > implementation-defined, so platforms with vendor-specific NMI vector
> > > > > > mechanisms must install the firmware's NMI entry points themselves.
> > > > > >
> > > > > > Add an smrnmi_handlers_init() callback to sbi_platform_operations that
> > > > > > receives the firmware entry points and lets platform code install them
> > > > > > at the hardware-specific vector locations. Two pointers are passed:
> > > > > >
> > > > > > - _trap_rnmi_handler: the dedicated RNMI entry point that saves
> > > > > > context using the Smrnmi MN* CSRs and returns via mnret.
> > > > > > - _trap_handler: the regular M-mode trap entry since RNME is taken
> > > > > > as a regular M-mode trap with NMIE=0.
> > > > > >
> > > > > > When Smrnmi is present, install the platform's NMI vectors via the new
> > > > > > callback, initialize MNSCRATCH with the per-hart scratch pointer, and
> > > > > > set MNSTATUS.NMIE.
> > > > > >
> > > > > > Smrnmi-enabled platforms must register smrnmi_handlers_init; if the
> > > > > > extension is detected but no callback is registered, sbi_panic() is
> > > > > > called since enabling NMIs without handlers in place would route
> > > > > > subsequent traps into nowhere.
> > > > > >
> > > > > > Signed-off-by: Evgeny Voevodin <evvoevod at tenstorrent.com>
> > > > >
> > > > > LGTM.
> > > > >
> > > > > Reviewed-by: Anup Patel <anup at brainfault.org>
> > > > >
> > > > > Regards,
> > > > > Anup
> > > > >
> > > > > > ---
> > > > > > include/sbi/sbi_platform.h | 4 ++++
> > > > > > lib/sbi/sbi_hart.c | 20 ++++++++++++++++++++
> > > > > > 2 files changed, 24 insertions(+)
> > > > > >
> > > > > > diff --git a/include/sbi/sbi_platform.h b/include/sbi/sbi_platform.h
> > > > > > index 715df499..fe382b56 100644
> > > > > > --- a/include/sbi/sbi_platform.h
> > > > > > +++ b/include/sbi/sbi_platform.h
> > > > > > @@ -150,6 +150,10 @@ struct sbi_platform_operations {
> > > > > > /** platform specific pmp disable on current HART */
> > > > > > void (*pmp_disable)(unsigned int n);
> > > > > >
> > > > > > + /** platform specific Smrnmi handlers init on current HART */
> > > > > > + void (*smrnmi_handlers_init)(void (*rnmi_handler)(void),
> > > > > > + void (*rnme_handler)(void));
> > > > > > +
> > > > > > /** platform specific Smrnmi NMI handler.
> > > > > > * Returns SBI_SUCCESS on success, error code if NMI cannot be handled. */
> > > > > > int (*rnmi_handler)(struct sbi_trap_context *tcntx);
> > > > > > diff --git a/lib/sbi/sbi_hart.c b/lib/sbi/sbi_hart.c
> > > > > > index 781161e5..92c602aa 100644
> > > > > > --- a/lib/sbi/sbi_hart.c
> > > > > > +++ b/lib/sbi/sbi_hart.c
> > > > > > @@ -532,6 +532,26 @@ static int hart_detect_features(struct sbi_scratch *scratch)
> > > > > > if (rc)
> > > > > > return rc;
> > > > > >
> > > > > > + if (sbi_hart_has_extension(scratch, SBI_HART_EXT_SMRNMI)) {
> > > > > > + const struct sbi_platform *plat = sbi_platform_thishart_ptr();
> > > > > > + const struct sbi_platform_operations *ops = sbi_platform_ops(plat);
> > > > > > + extern void _trap_rnmi_handler(void);
> > > > > > + extern void _trap_handler(void);
> > > > > > +
> > > > > > + if (!ops || !ops->smrnmi_handlers_init)
> > > > > > + sbi_panic("Smrnmi detected, but platform lacks smrnmi_handlers_init callback\n");
> > > > > > +
> > > > > > + /* Reuse _trap_handler for the RNME slot since RNME is taken
> > > > > > + * as a regular M-mode trap with NMIE=0. */
> > > > > > + ops->smrnmi_handlers_init(_trap_rnmi_handler, _trap_handler);
> > > > > > +
> > > > > > + /* Initialize MNSCRATCH for the RNMI handler */
> > > > > > + csr_write(CSR_MNSCRATCH, scratch);
> > > > > > +
> > > > > > + /* Enable NMIs */
> > > > > > + csr_set(CSR_MNSTATUS, MNSTATUS_NMIE);
> > > > Two concerns about this block
> > > >
> > > > First, sbi_panic() when smrnmi_handlers_init is NULL is too strict.
> > > > Not all platforms with Smrnmi need to program a vendor-specific NMI
> > > > vector register.
> > > > Some implementations have a fixed NMI vector address or route NMIs
> > > > through mtvec by default; for these, setting MNSCRATCH and NMIE is
> > > > sufficient and no platform callback is needed.
> > > >
> > > > This can be reproduced with QEMU virt:
> > > >
> > > > qemu-system-riscv64 -M virt -cpu rv64,smrnmi=on -m 256m \
> > > > -bios fw_jump.bin -nographic -no-reboot
> > > >
> > > > The firmware hangs with zero output. The panic fires before
> > > > sbi_platform_early_init() so the console is not yet initialized and
> > > > the message is silently dropped.
Agree, silent hang regresses platforms which don't need to setup Rnmi/e
handlers explicitly and just use default vectors.
> > > >
> > > > Adding a no-op stub to the generic platform confirms the panic is the
> > > > sole cause -- boot proceeds normally with the stub in place.
> > > >
> > > > Second, the callback returns void, so there is no way to detect
> > > > whether the platform succeeded in programming the NMI vector.
> > > > csr_set(MNSTATUS, NMIE) then runs unconditionally.
> > > > If the callback fails silently, NMIs are enabled but the hardware
> > > > vector register is left at its reset value. The first NMI would jump
> > > > to an undefined address.
> > > >
> > > > Suggested fix:
> > > >
> > > > /* allow NULL: platforms with fixed or mtvec-based NMI vectors
> > > > * do not need to program any vendor register */
> > > > if (ops && ops->smrnmi_handlers_init) {
> > > > int ret = ops->smrnmi_handlers_init(_trap_rnmi_handler,
> > > > _trap_handler);
> > > > if (ret)
> > > > return ret;
> > > > }
> > > >
> > > > csr_write(CSR_MNSCRATCH, scratch);
> > > > csr_set(CSR_MNSTATUS, MNSTATUS_NMIE);
> > > >
> > > > This requires changing the callback signature from void to int.
> > > >
> > > > Platforms that have nothing to program return 0, platforms that fail
> > > > return an error, and platforms with no callback at all are handled
> > > > gracefully without a panic.
What I would suggest here:
Instead of a stub handler provided by a platform which doesn't require
handlers setup, let's introduce a flag that will simply indicate if
platform provided handlers setup callback or not. Later, when console
becomes available, just print the informational line in the boot banner:
Smrnmi subsystem : Default <-- or "Platform"
This will help to not regress existing platforms and also provide some
diagnostics for those platforms which require handlers initialization
but missed that. Sketch implementation could be:
enum {
SMRNMI_INIT_NONE,
SMRNMI_INIT_DEFAULT,
SMRNMI_INIT_PLATFORM,
};
static int smrnmi_init_mode;
if (sbi_hart_has_extension(scratch, SBI_HART_EXT_SMRNMI)) {
if (ops && ops->smrnmi_handlers_init) {
int ret = ops->smrnmi_handlers_init(_trap_rnmi_handler,
_trap_handler);
if (ret)
return ret;
smrnmi_init_mode = SMRNMI_INIT_PLATFORM;
} else {
smrnmi_init_mode = SMRNMI_INIT_DEFAULT;
}
csr_write(CSR_MNSCRATCH, scratch);
csr_set(CSR_MNSTATUS, MNSTATUS_NMIE);
}
And in the boot info printer (sbi_boot_print_general_info() or
wherever the banner lives):
if (smrnmi_init_mode != SMRNMI_INIT_NONE)
sbi_printf("Smrnmi subsystem : %s\n",
smrnmi_init_mode == SMRNMI_INIT_PLATFORM
? "Platform" : "Default");
> > > Please send a patch with appropriate Fixes tag.
> > Will do.
> >
> > While preparing the patch, I also noticed the series is missing
> > non-retentive suspend/resume handling for CSR_MNSCRATCH and
> > MNSTATUS.NMIE.
> >
> > After non-retentive suspend, the hart is fully powered off and all CSR
> > state is reset.
> >
> > On resume, sbi_hsm_hart_resume_finish() calls
> > __sbi_hsm_suspend_non_ret_restore() and jumps directly to the kernel
> > without going through sbi_hart_init(), so the Smrnmi CSRs set up in
> > hart_detect_features() are never re-initialized.
>
> I think Smrnmi in hart_detect_features() should be moved to a
> separate function which should be called from two places:
> 1) At the current location from hart_detect_features() before
> we use trap-based extension detection
> 2) In sbi_hart_init() for both cold boot and warm boot path
>
> Regards,
> Anup
I think __sbi_hsm_suspend_non_ret_save/restore is the right place to
preserve and restore MNSCRATCH and MNSTATUS CSRs as recovery from
non-retentive state bypasses sbi_hart_init(). Also, these functions are
placeholders for other CSRs save/restore which don't survive during
non-retentive entry.
Thanks,
Evgeny
More information about the opensbi
mailing list