[PATCH] firmware: arm_scmi: Fix raw mode async completion race
Artem Shimko
a.shimko.dev at gmail.com
Fri Dec 19 03:46:16 PST 2025
A race condition exists in scmi_xfer_raw_worker() where async_done
completion can be accessed after being nullified by
scmi_xfer_raw_waiter_put() in scmi_handle_response(). This happens
because the worker skips wait_for_completion_timeout() when ret has
an error value and goes directly to nullify the async_done.
Fix by waiting for async_done unconditionally when it exists, without
checking the result of the ret variable. This ensures the worker
always synchronizes properly with the scmi_handle_response()'s complete
for async_done.
Fixes: 3c3d818a9317a ("firmware: arm_scmi: Add core raw transmission support")
Signed-off-by: Artem Shimko <a.shimko.dev at gmail.com>
---
Hello maintainers and reviewers,
This patch fixes a race condition in the SCMI raw mode implementation
that can lead to kernel crashes when handling asynchronous delayed
responses.
I temporarily added the trace_printk("%s\n", __func__) to track the problem.
# mount -t tracefs nodev /sys/kernel/tracing
# mount -t debugfs debugfs /sys/kernel/debug
# cd /sys/kernel/debug/tracing
# echo 1 > options/trace_printk
# echo 1 > tracing_on
Doing that until Oops
# echo -e -n 'sorry, but NDA raw msg' > /sys/kernel/debug/scmi/0/raw/message_async
Without the changes:
[ 19.513750] Unable to handle kernel NULL pointer dereference at virtual address NDA
[ 19.524756] Oops [#1]
[ 19.527034] Modules linked in:
[ 19.530097] CPU: 0 UID: 0 PID: 124 Comm: irq/12-1e200000 Not tainted 6.12.0-NDA
...
[ 19.638262] [<ffffffff8007af3c>] do_raw_spin_lock+0xa/0x10a
[ 19.643843] [<ffffffff80b4e42e>] _raw_spin_lock_irqsave+0x20/0x2c
[ 19.649941] [<ffffffff80072dc4>] complete+0x1e/0x76
[ 19.654826] [<ffffffff808056bc>] scmi_rx_callback+0x66e/0x7cc
[ 19.660589] [<ffffffff80800204>] transport_rx_callback+0x4e/0x56
[ 19.666534] [<ffffffff8082f664>] mbox_chan_received_data+0x10/0x1a
[ 19.672730] [<ffffffff80830624>] transport_chan_do_rx+0xea/0x136
[ 19.678311] [<ffffffff808306c6>] transport_mbox_threaded_isr+0x42/0x9c
[ 19.684408] [<ffffffff8008e8ee>] irq_thread_fn+0x1a/0x5a
[ 19.689733] [<ffffffff8008e724>] irq_thread+0x16c/0x20a
[ 19.694962] [<ffffffff80044032>] kthread+0xda/0xf6
[ 19.699761] [<ffffffff80b4f8d6>] ret_from_fork+0xe/0x18
[ 19.712403] ---[ end trace 0000000000000000 ]---
With the changes: we dont have the Oops
# echo 0 > tracing_on
# cat trace
Without the changes:
irq/12-1e200000-120 [000] ..... 23.368262: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [003] ..... 23.394836: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 25.625926: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [002] ..... 25.653884: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 27.202031: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [001] ..... 27.228216: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 28.504546: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [002] ..... 28.531534: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 30.102729: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [003] ..... 30.129688: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 31.108407: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [003] ..... 31.136012: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 32.388953: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [003] ..... 32.415700: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 33.737014: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [002] ..... 33.764977: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 34.979096: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [002] ..... 35.005377: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
kworker/u45:0-95 [003] ..... 36.758043: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put <- RC
irq/12-1e200000-119 [000] ..... 37.561734: scmi_rx_callback: scmi_handle_response
Withthe changes:
irq/12-1e200000-120 [000] ..... 23.368262: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [003] ..... 23.394836: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 25.625926: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [002] ..... 25.653884: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 27.202031: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [001] ..... 27.228216: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 28.504546: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [002] ..... 28.531534: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 30.102729: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [003] ..... 30.129688: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 31.108407: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [003] ..... 31.136012: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 32.388953: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [003] ..... 32.415700: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 33.737014: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [002] ..... 33.764977: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 34.979096: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [002] ..... 35.005377: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 36.028136: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [003] ..... 36.055553: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 36.906953: scmi_rx_callback: scmi_handle_response
kworker/u45:0-95 [002] ..... 36.933304: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 37.548706: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [003] ..... 37.577260: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 38.079993: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [002] ..... 38.108648: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
irq/12-1e200000-120 [000] ..... 38.512822: scmi_rx_callback: scmi_handle_response
kworker/u44:0-94 [002] ..... 38.540403: scmi_xfer_raw_waiter_put: scmi_xfer_raw_waiter_put
...
--
Regards,
Artem
drivers/firmware/arm_scmi/raw_mode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/arm_scmi/raw_mode.c b/drivers/firmware/arm_scmi/raw_mode.c
index 73db5492ab44..362773f3114d 100644
--- a/drivers/firmware/arm_scmi/raw_mode.c
+++ b/drivers/firmware/arm_scmi/raw_mode.c
@@ -479,7 +479,7 @@ static void scmi_xfer_raw_worker(struct work_struct *work)
ret, scmi_inflight_count(raw->handle));
/* Wait also for an async delayed response if needed */
- if (!ret && xfer->async_done) {
+ if (xfer->async_done) {
unsigned long tmo = msecs_to_jiffies(SCMI_MAX_RESPONSE_TIMEOUT);
if (!wait_for_completion_timeout(xfer->async_done, tmo))
--
2.43.0
More information about the linux-arm-kernel
mailing list