[PATCH v4 5/7] firmware: samsung: acpm: Fix false timeouts in polling path

Tudor Ambarus tudor.ambarus at linaro.org
Tue May 5 05:14:42 PDT 2026


Hi all,

Before sending out the v5 of this series, I wanted to share what Sashiko
found when reviewing this patch, I think they are great examples of
tricky concurrency bugs. I'll also give some hints on how I'll be fixing
the problems in v5.

> @@ -261,6 +264,12 @@ static int acpm_get_rx(struct acpm_chan *achan, const struct acpm_xfer *xfer)
>  			if (rx_seqnum == tx_seqnum) {
>  				__ioread32_copy(xfer->rxd, addr, xfer->rxcnt);
>  				rx_set = true;
> +				/*
> +				 * Signal completion to the polling thread.
> +				 * Pairs with smp_load_acquire() in polling
> +				 * loop.
> +				 */
> +				smp_store_release(&rx_data->completed, true);
>  				clear_bit(seqnum, achan->bitmap_seqnum);

Sashiko pointed out that calling clear_bit() immediately after setting
completed = true creates a fatal Use-After-Free window.

If Thread A natively finds its message, it copies the payload, sets
completed = true, and clears the sequence bit. If Thread A is preempted
right before evaluating smp_load_acquire() in the polling loop, a
concurrent Thread C can claim that exact same sequence number, overwrite
the buffer, and reset completed = false. When Thread A finally resumes,
it reads false, misses its completion signal, and eventually times out.

>  			} else {
>  				/*
> @@ -271,8 +280,19 @@ static int acpm_get_rx(struct acpm_chan *achan, const struct acpm_xfer *xfer)
>  				 */
>  				__ioread32_copy(rx_data->cmd, addr,
>  						rx_data->rxcnt);
> +				/*
> +				 * Signal completion to the polling thread.
> +				 * Pairs with smp_load_acquire() in polling
> +				 * loop.
> +				 */
> +				smp_store_release(&rx_data->completed, true);

The second issue occurs when dealing with cross-thread responses. If
Thread B drains the queue and finds Thread A's response, it saves it to
the cache and sets completed = true. It intentionally skips calling
clear_bit(), expecting Thread A to do it.

However, if Thread A calls acpm_get_rx() (finding nothing) and is
preempted right before evaluating smp_load_acquire(), Thread B can
step in, cache the data, and set the completed flag. When Thread A
resumes, it sees completed == true and immediately returns 0 (success).

Because Thread A returns immediately, it completely bypasses calling
acpm_get_saved_rx(). The result: the sequence bit and the response are
permanently leaked.

Both of these bugs stem from a lack of strict ownership over the
sequence slot. To fix this in v5, I will be rewriting the polling loop
to enforce the following invariant: Only the polling thread that
allocated the TX sequence number is allowed to free it, and only right
before it exits the polling loop. Specifically:
- clear_bit() will be completely stripped out of the RX draining paths
  (acpm_get_rx and acpm_get_saved_rx).
- acpm_get_rx() will now return a native_match boolean, which evaluates
  to true only if the thread natively processed its own sequence number
  during the call
- the polling loop will use native_match to determine if it must pull
  data from the cache.
- the polling loop will execute clear_bit() on its own tx_seqnum only
  after the payload is safely in the caller's buffer.

Cheers,
ta

>  			}
>  		} else {
> +			/*
> +			 * Signal completion to the polling thread.
> +			 * Pairs with smp_load_acquire() in polling loop.
> +			 */
> +			smp_store_release(&rx_data->completed, true);
>  			clear_bit(seqnum, achan->bitmap_seqnum);
>  		}
>  
> @@ -318,7 +338,13 @@ static int acpm_dequeue_by_polling(struct acpm_chan *achan,
>  		if (ret)
>  			return ret;
>  
> -		if (!test_bit(seqnum - 1, achan->bitmap_seqnum))
> +		/*
> +		 * Safely check if our specific transaction has been processed.
> +		 * smp_load_acquire prevents the CPU from speculatively
> +		 * executing subsequent instructions before the transaction is
> +		 * synchronized.
> +		 */
> +		if (smp_load_acquire(&achan->rx_data[seqnum - 1].completed))
>  			return 0;
>  
>  		/* Determined experimentally. */
> @@ -384,6 +410,7 @@ static void acpm_prepare_xfer(struct acpm_chan *achan,
>  
>  	/* Clear data for upcoming responses */
>  	rx_data = &achan->rx_data[achan->seqnum - 1];
> +	rx_data->completed = false;
>  	memset(rx_data->cmd, 0, sizeof(*rx_data->cmd) * rx_data->n_cmd);
>  	/* zero means no response expected */
>  	rx_data->rxcnt = xfer->rxcnt;
> 




More information about the linux-arm-kernel mailing list