[PATCH v3 6/6] lib: sbi: hart: Detect and enable Smrnmi before trap-based feature detection
Nylon Chen
nylon.chen at sifive.com
Tue May 12 01:25:08 PDT 2026
Anup Patel <anup at brainfault.org> 於 2026年5月12日週二 下午3:59寫道:
>
> On Tue, May 12, 2026 at 12:14 PM Nylon Chen <nylon.chen at sifive.com> wrote:
> >
> > Anup Patel <anup at brainfault.org> 於 2026年5月9日週六 下午2:51寫道:
> >
> > >
> > > On Thu, May 7, 2026 at 11:38 PM Evgeny Voevodin
> > > <evvoevod at tenstorrent.com> wrote:
> > > >
> > > > The location of the RNMI/E trap vectors in the Smrnmi extension is
> > > > implementation-defined, so platforms with vendor-specific NMI vector
> > > > mechanisms must install the firmware's NMI entry points themselves.
> > > >
> > > > Add an smrnmi_handlers_init() callback to sbi_platform_operations that
> > > > receives the firmware entry points and lets platform code install them
> > > > at the hardware-specific vector locations. Two pointers are passed:
> > > >
> > > > - _trap_rnmi_handler: the dedicated RNMI entry point that saves
> > > > context using the Smrnmi MN* CSRs and returns via mnret.
> > > > - _trap_handler: the regular M-mode trap entry since RNME is taken
> > > > as a regular M-mode trap with NMIE=0.
> > > >
> > > > When Smrnmi is present, install the platform's NMI vectors via the new
> > > > callback, initialize MNSCRATCH with the per-hart scratch pointer, and
> > > > set MNSTATUS.NMIE.
> > > >
> > > > Smrnmi-enabled platforms must register smrnmi_handlers_init; if the
> > > > extension is detected but no callback is registered, sbi_panic() is
> > > > called since enabling NMIs without handlers in place would route
> > > > subsequent traps into nowhere.
> > > >
> > > > Signed-off-by: Evgeny Voevodin <evvoevod at tenstorrent.com>
> > >
> > > LGTM.
> > >
> > > Reviewed-by: Anup Patel <anup at brainfault.org>
> > >
> > > Regards,
> > > Anup
> > >
> > > > ---
> > > > include/sbi/sbi_platform.h | 4 ++++
> > > > lib/sbi/sbi_hart.c | 20 ++++++++++++++++++++
> > > > 2 files changed, 24 insertions(+)
> > > >
> > > > diff --git a/include/sbi/sbi_platform.h b/include/sbi/sbi_platform.h
> > > > index 715df499..fe382b56 100644
> > > > --- a/include/sbi/sbi_platform.h
> > > > +++ b/include/sbi/sbi_platform.h
> > > > @@ -150,6 +150,10 @@ struct sbi_platform_operations {
> > > > /** platform specific pmp disable on current HART */
> > > > void (*pmp_disable)(unsigned int n);
> > > >
> > > > + /** platform specific Smrnmi handlers init on current HART */
> > > > + void (*smrnmi_handlers_init)(void (*rnmi_handler)(void),
> > > > + void (*rnme_handler)(void));
> > > > +
> > > > /** platform specific Smrnmi NMI handler.
> > > > * Returns SBI_SUCCESS on success, error code if NMI cannot be handled. */
> > > > int (*rnmi_handler)(struct sbi_trap_context *tcntx);
> > > > diff --git a/lib/sbi/sbi_hart.c b/lib/sbi/sbi_hart.c
> > > > index 781161e5..92c602aa 100644
> > > > --- a/lib/sbi/sbi_hart.c
> > > > +++ b/lib/sbi/sbi_hart.c
> > > > @@ -532,6 +532,26 @@ static int hart_detect_features(struct sbi_scratch *scratch)
> > > > if (rc)
> > > > return rc;
> > > >
> > > > + if (sbi_hart_has_extension(scratch, SBI_HART_EXT_SMRNMI)) {
> > > > + const struct sbi_platform *plat = sbi_platform_thishart_ptr();
> > > > + const struct sbi_platform_operations *ops = sbi_platform_ops(plat);
> > > > + extern void _trap_rnmi_handler(void);
> > > > + extern void _trap_handler(void);
> > > > +
> > > > + if (!ops || !ops->smrnmi_handlers_init)
> > > > + sbi_panic("Smrnmi detected, but platform lacks smrnmi_handlers_init callback\n");
> > > > +
> > > > + /* Reuse _trap_handler for the RNME slot since RNME is taken
> > > > + * as a regular M-mode trap with NMIE=0. */
> > > > + ops->smrnmi_handlers_init(_trap_rnmi_handler, _trap_handler);
> > > > +
> > > > + /* Initialize MNSCRATCH for the RNMI handler */
> > > > + csr_write(CSR_MNSCRATCH, scratch);
> > > > +
> > > > + /* Enable NMIs */
> > > > + csr_set(CSR_MNSTATUS, MNSTATUS_NMIE);
> > Two concerns about this block
> >
> > First, sbi_panic() when smrnmi_handlers_init is NULL is too strict.
> > Not all platforms with Smrnmi need to program a vendor-specific NMI
> > vector register.
> > Some implementations have a fixed NMI vector address or route NMIs
> > through mtvec by default; for these, setting MNSCRATCH and NMIE is
> > sufficient and no platform callback is needed.
> >
> > This can be reproduced with QEMU virt:
> >
> > qemu-system-riscv64 -M virt -cpu rv64,smrnmi=on -m 256m \
> > -bios fw_jump.bin -nographic -no-reboot
> >
> > The firmware hangs with zero output. The panic fires before
> > sbi_platform_early_init() so the console is not yet initialized and
> > the message is silently dropped.
> >
> > Adding a no-op stub to the generic platform confirms the panic is the
> > sole cause -- boot proceeds normally with the stub in place.
> >
> > Second, the callback returns void, so there is no way to detect
> > whether the platform succeeded in programming the NMI vector.
> > csr_set(MNSTATUS, NMIE) then runs unconditionally.
> > If the callback fails silently, NMIs are enabled but the hardware
> > vector register is left at its reset value. The first NMI would jump
> > to an undefined address.
> >
> > Suggested fix:
> >
> > /* allow NULL: platforms with fixed or mtvec-based NMI vectors
> > * do not need to program any vendor register */
> > if (ops && ops->smrnmi_handlers_init) {
> > int ret = ops->smrnmi_handlers_init(_trap_rnmi_handler,
> > _trap_handler);
> > if (ret)
> > return ret;
> > }
> >
> > csr_write(CSR_MNSCRATCH, scratch);
> > csr_set(CSR_MNSTATUS, MNSTATUS_NMIE);
> >
> > This requires changing the callback signature from void to int.
> >
> > Platforms that have nothing to program return 0, platforms that fail
> > return an error, and platforms with no callback at all are handled
> > gracefully without a panic.
> >
>
> Please send a patch with appropriate Fixes tag.
Will do.
While preparing the patch, I also noticed the series is missing
non-retentive suspend/resume handling for CSR_MNSCRATCH and
MNSTATUS.NMIE.
After non-retentive suspend, the hart is fully powered off and all CSR
state is reset.
On resume, sbi_hsm_hart_resume_finish() calls
__sbi_hsm_suspend_non_ret_restore() and jumps directly to the kernel
without going through sbi_hart_init(), so the Smrnmi CSRs set up in
hart_detect_features() are never re-initialized.
I will send both fixes together as a two-patch series after testing.
>
> Regards,
> Anup
More information about the opensbi
mailing list