[PATCH v9 3/5] drivers: firmware: add riscv SSE support

Fri May 22 09:40:58 PDT 2026

On Sat, May 09, 2026 at 02:38:15PM +0000, sashiko-bot at kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 12 potential issue(s) to consider:
> - [Critical] Missing braces around `scoped_guard(cpus_read_lock)` in `sse_event_unregister` releases lock prematurely, leading to a race condition and use-after-free.
> - [High] Event registration rollback uses the wrong revert function, leaking firmware registrations and causing use-after-free.
> - [High] Missing validation of `riscv_hartid_to_cpuid` return value leads to out-of-bounds array access.
> - [High] `sse_cpu_teardown` causes out-of-bounds array access on single-CPU teardown and fails to unregister global events across reboots.
> - [High] 64-bit physical addresses are silently truncated to 32 bits on RV32.
> - [High] CPU hotplug unconditionally enables local events, ignoring driver-defined disabled states.
> - [High] Early release of `cpus_read_lock` in `sse_event_register` causes missed event registration on hotplugged CPUs.
> - [High] `sse_event_alloc_global` calls `smp_processor_id()` in a preemptible context.
> - [High] Missing synchronization in `sse_event_unregister()` leads to use-after-free with concurrently executing hardware event handlers.
> - [Medium] `sse_event_set_target_cpu_nolock` silently ignores attribute write failures.
> - [Medium] Mixing `goto` error handling with `guard()` and `scoped_guard()` declarations violates subsystem cleanup guidelines.
> - [Medium] The commit message claims clients can 'complete SSE events', but no such API exists.

FWIW, I would like to see some responses to these sashiko reports, they
look pretty numerous, so there's a good chance one of them is valid!

The 2nd medium one looks valid, can probably hit it with a DEFINE_FREE()
for sse_event_free(event) and just return?

The critical one I think deserves a comment in the code if what's there
is okay.

Cheers,
Conor.

> --
> 
> commit 1f43f4083e7681521e63a564e1ff80519c1569f5
> Author: Clément Léger <cleger at rivosinc.com>
> Subject: drivers: firmware: add riscv SSE support
> 
> This commit adds a driver-level interface for RISC-V SSE support, allowing
> Linux clients to register handlers and manage SSE events. It supports both
> local and global events and handles CPU hotplug and PM transitions.
> 
> >     Add a driver-level interface for RISC-V SSE support.  The interface lets
> >     Linux clients register handlers and enable, disable and complete SSE
> >     events.  It will be used by PMU and GHES drivers.
> 
> The commit message states the interface lets clients complete SSE events.
> However, there does not appear to be any API provided to complete an event.
> 
> Could the commit message be updated to reflect the actual implementation?
> 
> > diff --git a/drivers/firmware/riscv/riscv_sbi_sse.c b/drivers/firmware/riscv/riscv_sbi_sse.c
> > new file mode 100644
> > index 0000000000000..830011ddd33f7
> > --- /dev/null
> > +++ b/drivers/firmware/riscv/riscv_sbi_sse.c
> > @@ -0,0 +1,694 @@
> [ ... ]
> > +static int sse_event_attr_get_no_lock(struct sse_registered_event *reg_evt,
> > +				      unsigned long attr_id, unsigned long *val)
> > +{
> > +	struct sbiret sret;
> > +	u32 evt = reg_evt->event->evt_id;
> > +	unsigned long phys;
> > +
> > +	phys = sse_event_get_attr_phys(reg_evt);
> 
> Can this physical address be truncated on RV32 systems?
> 
> When CONFIG_ARCH_PHYS_ADDR_T_64BIT is enabled, physical addresses are 64 bits,
> but unsigned long is 32 bits. Will the firmware receive a corrupted 32-bit
> physical address since the ABI expects 64-bit values to be passed in two
> registers on RV32?
> 
> [ ... ]
> > +static int sse_event_attr_set_nolock(struct sse_registered_event *reg_evt,
> > +				     unsigned long attr_id, unsigned long val)
> > +{
> > +	struct sbiret sret;
> > +	u32 evt = reg_evt->event->evt_id;
> > +	unsigned long phys;
> > +
> > +	reg_evt->attr = val;
> > +	phys = sse_event_get_attr_phys(reg_evt);
> 
> Is this subject to the same RV32 physical address truncation mentioned above?
> 
> [ ... ]
> > +static int sse_event_set_target_cpu_nolock(struct sse_event *event,
> > +					   unsigned int cpu)
> > +{
> > +	unsigned long hart_id = cpuid_to_hartid_map(cpu);
> > +	struct sse_registered_event *reg_evt = event->global;
> > +	u32 evt = event->evt_id;
> > +	bool was_enabled;
> > +	int ret;
> > +
> > +	if (!sse_event_is_global(evt))
> > +		return -EINVAL;
> > +
> > +	was_enabled = reg_evt->is_enabled;
> > +	if (was_enabled)
> > +		sse_event_disable_local(event);
> > +
> > +	ret = sse_event_attr_set_nolock(reg_evt, SBI_SSE_ATTR_PREFERRED_HART,
> > +					hart_id);
> > +	if (ret == 0)
> > +		sse_global_event_update_cpu(event, cpu);
> > +
> > +	if (was_enabled)
> > +		sse_event_enable_local(event);
> > +
> > +	return 0;
> > +}
> 
> Should this function return the error code instead of unconditionally returning 0?
> 
> If the SBI call to set the preferred hart fails, the error is masked from the
> caller, leaving the kernel thinking the migration succeeded while the firmware
> state is out of sync.
> 
> [ ... ]
> > +static int sse_event_alloc_global(struct sse_event *event)
> > +{
> > +	int err;
> > +	struct sse_registered_event *reg_evt;
> > +
> > +	reg_evt = kzalloc_obj(*reg_evt, GFP_KERNEL);
> > +	if (!reg_evt)
> > +		return -ENOMEM;
> > +
> > +	event->global = reg_evt;
> > +	err = sse_event_init_registered(smp_processor_id(), reg_evt, event);
> 
> Is it safe to call smp_processor_id() here?
> 
> sse_event_alloc_global() is invoked from sse_event_register(), which only
> holds a mutex and runs with preemption enabled. Could this trigger a warning
> if CONFIG_DEBUG_PREEMPT is enabled, or allow the thread to migrate
> immediately after, yielding an invalid CPU ID?
> 
> [ ... ]
> > +struct sse_event *sse_event_register(u32 evt, u32 priority,
> > +				     sse_event_handler_fn *handler, void *arg)
> > +{
> > +	struct sse_event *event;
> > +	int cpu;
> > +	int ret = 0;
> > +
> > +	if (!sse_available)
> > +		return ERR_PTR(-EOPNOTSUPP);
> > +
> > +	guard(mutex)(&sse_mutex);
> 
> This isn't a bug, but is it intended to mix goto-based error handling with
> scope-based guard declarations in the same function?
> 
> The cleanup subsystem documentation recommends against this pattern as it can
> create confusing ownership semantics.
> 
> > +	if (sse_event_get(evt))
> > +		return ERR_PTR(-EEXIST);
> > +
> > +	event = sse_event_alloc(evt, priority, handler, arg);
> > +	if (IS_ERR(event))
> > +		return event;
> > +
> > +	scoped_guard(cpus_read_lock) {
> > +		if (sse_event_is_global(evt)) {
> > +			unsigned long preferred_hart;
> > +
> > +			ret = sse_event_attr_get_no_lock(event->global,
> > +							 SBI_SSE_ATTR_PREFERRED_HART,
> > +							 &preferred_hart);
> > +			if (ret)
> > +				goto err_event_free;
> > +
> > +			cpu = riscv_hartid_to_cpuid(preferred_hart);
> 
> Can riscv_hartid_to_cpuid() return -ENOENT here?
> 
> If the firmware returns an unknown or unmapped hart ID, it appears the
> negative value is passed to sse_global_event_update_cpu(), which uses it to
> index the logical hart array via cpuid_to_hartid_map(), potentially causing
> an out-of-bounds memory read.
> 
> > +			sse_global_event_update_cpu(event, cpu);
> > +
> > +			ret = sse_sbi_register_event(event, event->global);
> > +			if (ret)
> > +				goto err_event_free;
> > +
> > +		} else {
> > +			ret = sse_on_each_cpu(event, SBI_SSE_EVENT_REGISTER,
> > +					      SBI_SSE_EVENT_DISABLE);
> 
> Should the revert function be SBI_SSE_EVENT_UNREGISTER instead of
> SBI_SSE_EVENT_DISABLE?
> 
> Disabling an event merely masks it without unregistering it from the firmware.
> If registration fails on any CPU, does this leave the firmware holding an
> active registration pointing to freed memory after the error path frees the
> event structures?
> 
> > +			if (ret)
> > +				goto err_event_free;
> > +		}
> > +	}
> > +
> > +	scoped_guard(spinlock, &events_list_lock)
> > +		list_add(&event->list, &events);
> 
> Because the cpus_read_lock is released before acquiring events_list_lock to
> add the event to the global list, could a CPU hotplug operation occur in
> this window?
> 
> If so, sse_cpu_online() would iterate over the list and miss the new event,
> resulting in the event not being registered on the new CPU.
> 
> > +
> > +	return event;
> > +
> > +err_event_free:
> > +	sse_event_free(event);
> > +
> > +	return ERR_PTR(ret);
> > +}
> [ ... ]
> > +void sse_event_unregister(struct sse_event *event)
> > +{
> > +	scoped_guard(mutex, &sse_mutex) {
> > +		scoped_guard(cpus_read_lock)
> > +			sse_event_unregister_nolock(event);
> 
> Is there a missing compound block around the cpus_read_lock guard?
> 
> Without braces, the lock is dropped before acquiring events_list_lock. Could a
> concurrent CPU hotplug operation acquire the list lock, find the partially
> unregistered event still in the list, and erroneously register and enable
> it in the firmware for the new CPU?
> 
> > +
> > +		scoped_guard(spinlock, &events_list_lock)
> > +			list_del(&event->list);
> > +
> > +		sse_event_free(event);
> 
> Since SBI events are delivered as NMI-like notifications, could an event have
> just fired and its handler be actively executing on another CPU?
> 
> Without synchronization like waiting for active handlers or RCU grace periods,
> it seems sse_handle_event() might dereference evt->handler and
> evt->handler_arg after the memory has been freed.
> 
> > +	}
> > +}
> [ ... ]
> > +static int sse_cpu_online(unsigned int cpu)
> > +{
> > +	struct sse_event *event;
> > +
> > +	scoped_guard(spinlock, &events_list_lock) {
> > +		list_for_each_entry(event, &events, list) {
> > +			if (sse_event_is_global(event->evt_id))
> > +				continue;
> > +
> > +			sse_event_register_local(event);
> > +			if (sse_get_reg_evt(event))
> > +				sse_event_enable_local(event);
> 
> Should this check reg_evt->is_enabled before unconditionally enabling the event?
> 
> If a driver registers an event but intentionally leaves it disabled, it looks
> like it will unexpectedly become enabled on the newly hotplugged CPU.
> 
> > +		}
> > +	}
> > +
> > +	/* Ready to handle events. Unmask SSE. */
> > +	return sse_events_unmask();
> > +}
> > +
> > +static int sse_cpu_teardown(unsigned int cpu)
> > +{
> > +	int ret = 0;
> > +	unsigned int next_cpu;
> > +	struct sse_event *event;
> > +	struct sse_registered_event *reg_evt;
> > +
> > +	/* Mask the sse events */
> > +	ret = sse_events_mask();
> > +	if (ret)
> > +		return ret;
> > +
> > +	scoped_guard(spinlock, &events_list_lock) {
> > +		list_for_each_entry(event, &events, list) {
> > +			/* Disable local event on current cpu */
> > +			if (!sse_event_is_global(event->evt_id)) {
> > +				reg_evt = sse_get_reg_evt(event);
> > +				if (reg_evt->is_enabled)
> > +					sse_event_disable_local(event);
> > +
> > +				sse_sbi_unregister_event(event);
> > +				continue;
> > +			}
> > +
> > +			if (event->cpu != smp_processor_id())
> > +				continue;
> > +
> > +			/* Update destination hart for global event */
> > +			next_cpu = cpumask_any_but(cpu_online_mask, cpu);
> > +			ret = sse_event_set_target_cpu_nolock(event, next_cpu);
> 
> What happens if there are no other CPUs online during teardown?
> 
> If cpumask_any_but() returns a value greater than or equal to nr_cpu_ids,
> passing it to sse_event_set_target_cpu_nolock() might result in an
> out-of-bounds array read in cpuid_to_hartid_map().
> 
> Furthermore, by migrating rather than unregistering global events on reboot,
> does this break kexec by leaving the event actively registered for the next
> kernel?
> 
> > +		}
> > +	}
> > +
> > +	return ret;
> > +}
> 
> -- 
> Sashiko AI review · https://sashiko.dev/#/patchset/cover.1778331862.git.zhangzhanpeng.jasper@bytedance.com?part=3
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-riscv/attachments/20260522/e8725c36/attachment.sig>