[PATCH v3 6/6] lib: sbi: hart: Detect and enable Smrnmi before trap-based feature detection

Nylon Chen nylon.chen at sifive.com
Tue May 12 01:25:08 PDT 2026


Anup Patel <anup at brainfault.org> 於 2026年5月12日週二 下午3:59寫道:
>
> On Tue, May 12, 2026 at 12:14 PM Nylon Chen <nylon.chen at sifive.com> wrote:
> >
> > Anup Patel <anup at brainfault.org> 於 2026年5月9日週六 下午2:51寫道:
> >
> > >
> > > On Thu, May 7, 2026 at 11:38 PM Evgeny Voevodin
> > > <evvoevod at tenstorrent.com> wrote:
> > > >
> > > > The location of the RNMI/E trap vectors in the Smrnmi extension is
> > > > implementation-defined, so platforms with vendor-specific NMI vector
> > > > mechanisms must install the firmware's NMI entry points themselves.
> > > >
> > > > Add an smrnmi_handlers_init() callback to sbi_platform_operations that
> > > > receives the firmware entry points and lets platform code install them
> > > > at the hardware-specific vector locations. Two pointers are passed:
> > > >
> > > >   - _trap_rnmi_handler: the dedicated RNMI entry point that saves
> > > >     context using the Smrnmi MN* CSRs and returns via mnret.
> > > >   - _trap_handler: the regular M-mode trap entry since RNME is taken
> > > >     as a regular M-mode trap with NMIE=0.
> > > >
> > > > When Smrnmi is present, install the platform's NMI vectors via the new
> > > > callback, initialize MNSCRATCH with the per-hart scratch pointer, and
> > > > set MNSTATUS.NMIE.
> > > >
> > > > Smrnmi-enabled platforms must register smrnmi_handlers_init; if the
> > > > extension is detected but no callback is registered, sbi_panic() is
> > > > called since enabling NMIs without handlers in place would route
> > > > subsequent traps into nowhere.
> > > >
> > > > Signed-off-by: Evgeny Voevodin <evvoevod at tenstorrent.com>
> > >
> > > LGTM.
> > >
> > > Reviewed-by: Anup Patel <anup at brainfault.org>
> > >
> > > Regards,
> > > Anup
> > >
> > > > ---
> > > >  include/sbi/sbi_platform.h |  4 ++++
> > > >  lib/sbi/sbi_hart.c         | 20 ++++++++++++++++++++
> > > >  2 files changed, 24 insertions(+)
> > > >
> > > > diff --git a/include/sbi/sbi_platform.h b/include/sbi/sbi_platform.h
> > > > index 715df499..fe382b56 100644
> > > > --- a/include/sbi/sbi_platform.h
> > > > +++ b/include/sbi/sbi_platform.h
> > > > @@ -150,6 +150,10 @@ struct sbi_platform_operations {
> > > >         /** platform specific pmp disable on current HART */
> > > >         void (*pmp_disable)(unsigned int n);
> > > >
> > > > +       /** platform specific Smrnmi handlers init on current HART */
> > > > +       void (*smrnmi_handlers_init)(void (*rnmi_handler)(void),
> > > > +                       void (*rnme_handler)(void));
> > > > +
> > > >         /** platform specific Smrnmi NMI handler.
> > > >          *  Returns SBI_SUCCESS on success, error code if NMI cannot be handled. */
> > > >         int (*rnmi_handler)(struct sbi_trap_context *tcntx);
> > > > diff --git a/lib/sbi/sbi_hart.c b/lib/sbi/sbi_hart.c
> > > > index 781161e5..92c602aa 100644
> > > > --- a/lib/sbi/sbi_hart.c
> > > > +++ b/lib/sbi/sbi_hart.c
> > > > @@ -532,6 +532,26 @@ static int hart_detect_features(struct sbi_scratch *scratch)
> > > >         if (rc)
> > > >                 return rc;
> > > >
> > > > +       if (sbi_hart_has_extension(scratch, SBI_HART_EXT_SMRNMI)) {
> > > > +               const struct sbi_platform *plat = sbi_platform_thishart_ptr();
> > > > +               const struct sbi_platform_operations *ops = sbi_platform_ops(plat);
> > > > +               extern void _trap_rnmi_handler(void);
> > > > +               extern void _trap_handler(void);
> > > > +
> > > > +               if (!ops || !ops->smrnmi_handlers_init)
> > > > +                       sbi_panic("Smrnmi detected, but platform lacks smrnmi_handlers_init callback\n");
> > > > +
> > > > +               /* Reuse _trap_handler for the RNME slot since RNME is taken
> > > > +                * as a regular M-mode trap with NMIE=0. */
> > > > +               ops->smrnmi_handlers_init(_trap_rnmi_handler, _trap_handler);
> > > > +
> > > > +               /* Initialize MNSCRATCH for the RNMI handler */
> > > > +               csr_write(CSR_MNSCRATCH, scratch);
> > > > +
> > > > +               /* Enable NMIs */
> > > > +               csr_set(CSR_MNSTATUS, MNSTATUS_NMIE);
> > Two concerns about this block
> >
> > First, sbi_panic() when smrnmi_handlers_init is NULL is too strict.
> > Not all platforms with Smrnmi need to program a vendor-specific NMI
> > vector register.
> > Some implementations have a fixed NMI vector address or route NMIs
> > through mtvec by default; for these, setting MNSCRATCH and NMIE is
> > sufficient and no platform callback is needed.
> >
> > This can be reproduced with QEMU virt:
> >
> > qemu-system-riscv64 -M virt -cpu rv64,smrnmi=on -m 256m \
> >       -bios fw_jump.bin -nographic -no-reboot
> >
> > The firmware hangs with zero output. The panic fires before
> > sbi_platform_early_init() so the console is not yet initialized and
> > the message is silently dropped.
> >
> > Adding a no-op stub to the generic platform confirms the panic is  the
> > sole cause -- boot proceeds normally with the stub in place.
> >
> > Second, the callback returns void, so there is no way to detect
> > whether the platform succeeded in programming the NMI vector.
> > csr_set(MNSTATUS, NMIE) then runs unconditionally.
> > If the callback fails silently, NMIs are enabled but the hardware
> > vector register is left at its reset value. The first NMI would jump
> > to an undefined address.
> >
> > Suggested fix:
> >
> > /* allow NULL: platforms with fixed or mtvec-based NMI vectors
> > * do not need to program any vendor register */
> > if (ops && ops->smrnmi_handlers_init) {
> >         int ret = ops->smrnmi_handlers_init(_trap_rnmi_handler,
> >                                             _trap_handler);
> >         if (ret)
> >             return ret;
> > }
> >
> > csr_write(CSR_MNSCRATCH, scratch);
> > csr_set(CSR_MNSTATUS, MNSTATUS_NMIE);
> >
> > This requires changing the callback signature from void to int.
> >
> > Platforms that have nothing to program return 0, platforms that fail
> > return an error, and platforms with no callback at all are handled
> > gracefully without a panic.
> >
>
> Please send a patch with appropriate Fixes tag.
Will do.

While preparing the patch, I also noticed the series is missing
non-retentive suspend/resume handling for CSR_MNSCRATCH and
  MNSTATUS.NMIE.

After non-retentive suspend, the hart is fully powered off and all CSR
state is reset.

On resume, sbi_hsm_hart_resume_finish() calls
__sbi_hsm_suspend_non_ret_restore() and jumps directly to the kernel
without going through sbi_hart_init(), so the Smrnmi CSRs set up in
hart_detect_features() are never re-initialized.

I will send both fixes together as a two-patch series after testing.

>
> Regards,
> Anup



More information about the opensbi mailing list