[PATCH RESEND RFC 1/3] net: ath11k: fix redundant reset from stale pending workqueue bit

Jeff Johnson jeff.johnson at oss.qualcomm.com
Tue May 12 16:09:40 PDT 2026


On 3/30/2026 3:05 AM, Matthew Leach wrote:
> During a firmware lockup, WMI commands time out in rapid succession,
> each calling queue_work() to schedule ath11k_core_reset().  This can
> cause a spurious extra reset after recovery completes:
> 
> 1. First WMI timeout calls queue_work(), sets the pending bit and
>    schedules ath11k_core_reset(). The workqueue clears the pending bit
>    before invoking the work function. reset_count becomes 1 and the reset
>    is kicked off asynchronously. ath11k_core_reset() returns.
> 
> 2. Second WMI timeout calls queue_work() and re-queues the work. When it
>    runs after step 1 returns, it sees reset_count > 1 and blocks in
>    wait_for_completion(). The pending bit is again cleared.
> 
> 3. Third WMI timeout calls queue_work(), the pending bit was cleared in
>    step 2, so this succeeds and arms another execution.
> 
> 4. The asynchronous reset finishes. ath11k_mac_op_reconfig_complete()
>    decrements reset_count and calls complete(). The blocked worker from
>    step 2 wakes, takes the early-exit path, and decrements reset_count to
>    0.
> 
> 5. The workqueue sees the pending bit from step 3 and runs
>    ath11k_core_reset() again. reset_count is 0, triggering a
>    full redundant hardware reset.
> 
> Fix this by calling cancel_work() on reset_work in
> ath11k_mac_op_reconfig_complete() before signalling completion. This
> clears any stale pending bit, preventing the spurious re-execution.
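
For anyone following the five steps above, here is a minimal single-threaded sketch of the sequence. It is a toy model, not driver code: struct model, the model_* helpers and simulate_lockup() are hypothetical names, and the only workqueue behaviour assumed is what the commit message describes (queue_work() is a no-op while the pending bit is set, and the bit is cleared before the work function runs).

```c
#include <stdbool.h>

/* Toy state; the names are illustrative, not the driver's structures. */
struct model {
	bool pending;     /* the work item's pending bit */
	bool blocked;     /* a worker is waiting for the reset to complete */
	int reset_count;  /* stands in for ab->reset_count */
	int hw_resets;    /* number of full hardware resets started */
};

/* Like queue_work(): does nothing if the pending bit is already set. */
static bool model_queue_work(struct model *m)
{
	if (m->pending)
		return false;
	m->pending = true;
	return true;
}

/* Like cancel_work(): clears the pending bit, reports whether it was set. */
static bool model_cancel_work(struct model *m)
{
	bool was_pending = m->pending;

	m->pending = false;
	return was_pending;
}

/* One execution of the reset work, up to the point it returns or blocks. */
static void model_run_work(struct model *m)
{
	m->pending = false;         /* cleared before the function runs */
	if (++m->reset_count > 1)
		m->blocked = true;  /* a reset is already in flight: wait */
	else
		m->hw_resets++;     /* kick off an asynchronous reset */
}

/* The reconfig-complete path, with or without the proposed cancel. */
static void model_reconfig_complete(struct model *m, bool cancel)
{
	if (cancel)
		model_cancel_work(m);
	m->reset_count--;
	if (m->blocked) {           /* completion wakes the waiting worker */
		m->blocked = false;
		m->reset_count--;   /* early-exit path */
	}
}

/* Replays steps 1-5 and returns how many hardware resets were started. */
int simulate_lockup(bool cancel)
{
	struct model m = { 0 };

	model_queue_work(&m);                 /* step 1 */
	model_run_work(&m);
	model_queue_work(&m);                 /* step 2 */
	model_run_work(&m);
	model_queue_work(&m);                 /* step 3 */
	model_reconfig_complete(&m, cancel);  /* step 4 */
	if (m.pending)                        /* step 5 */
		model_run_work(&m);
	return m.hw_resets;
}
```

In this model simulate_lockup(false) starts two hardware resets and simulate_lockup(true) starts one, which matches the behaviour described in the commit message. It says nothing about real concurrency or timing in the driver.
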
> 
> Signed-off-by: Matthew Leach <matthew.leach at collabora.com>
> ---
>  drivers/net/wireless/ath/ath11k/mac.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
> index e4ee2ba1f669..748f779b3d1b 100644
> --- a/drivers/net/wireless/ath/ath11k/mac.c
> +++ b/drivers/net/wireless/ath/ath11k/mac.c
> @@ -9274,6 +9274,10 @@ ath11k_mac_op_reconfig_complete(struct ieee80211_hw *hw,
>  			 * the recovery has to be done for each radio
>  			 */
>  			if (recovery_count == ab->num_radios) {
> +				/* Cancel any pending work, preventing a second redudant

Nits:
1) Networking no longer uses its own block comment style, so use the
   standard kernel style, where /* is on a line by itself.
2) s/redudant/redundant/ (the subject has it right)
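
To illustrate both points, the comment would look roughly like the sketch below. The struct work_struct and cancel_work() definitions are stand-ins so that the fragment compiles outside the kernel, and finish_recovery() is a hypothetical wrapper, not a function in the driver.

```c
#include <stdbool.h>

/* Stand-in for the kernel type, reduced to the pending bit. */
struct work_struct {
	bool pending;
};

/* Stand-in for cancel_work(): clears the pending bit, returns its old value. */
static bool cancel_work(struct work_struct *work)
{
	bool was_pending = work->pending;

	work->pending = false;
	return was_pending;
}

/* Hypothetical wrapper around the lines the patch adds. */
void finish_recovery(struct work_struct *reset_work)
{
	/*
	 * Cancel any pending work, preventing a second redundant
	 * reset.
	 */
	cancel_work(reset_work);
}
```
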

But don't post a new version just for these; wait for any other comments
first. I'm pinging the development team to look at this thread.

> +				 * reset.
> +				 */
> +				cancel_work(&ab->reset_work);
>  				atomic_dec(&ab->reset_count);
>  				complete(&ab->reset_complete);
>  				ab->is_reset = false;
> 



