[PATCH RESEND RFC 1/3] net: ath11k: fix redundant reset from stale pending workqueue bit
Jeff Johnson
jeff.johnson at oss.qualcomm.com
Tue May 12 16:09:40 PDT 2026
On 3/30/2026 3:05 AM, Matthew Leach wrote:
> During a firmware lockup, WMI commands time out in rapid succession,
> each calling queue_work() to schedule ath11k_core_reset(). This can
> cause a spurious extra reset after recovery completes:
>
> 1. First WMI timeout calls queue_work(), sets the pending bit and
> schedules ath11k_core_reset(). The workqueue clears the pending bit
> before invoking the work function. reset_count becomes 1 and the reset
> is kicked off asynchronously. ath11k_core_reset() returns.
>
> 2. Second WMI timeout calls queue_work() and re-queues the work. When it
> runs after step 1 returns, it sees reset_count > 1 and blocks in
> wait_for_completion(). The pending bit is again cleared.
>
> 3. Third WMI timeout calls queue_work(), the pending bit was cleared in
> step 2, so this succeeds and arms another execution.
>
> 4. The asynchronous reset finishes. ath11k_mac_op_reconfig_complete()
> decrements reset_count and calls complete(). The blocked worker from
> step 2 wakes, takes the early-exit path, and decrements reset_count to
> 0.
>
> 5. The workqueue sees the pending bit from step 3 and runs
> ath11k_core_reset() again. reset_count is 0, triggering a
> full redundant hardware reset.
>
> Fix this by calling cancel_work() on reset_work in
> ath11k_mac_op_reconfig_complete() before signalling completion. This
> clears any stale pending bit, preventing the spurious re-execution.
>
> Signed-off-by: Matthew Leach <matthew.leach at collabora.com>
> ---
> drivers/net/wireless/ath/ath11k/mac.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
> index e4ee2ba1f669..748f779b3d1b 100644
> --- a/drivers/net/wireless/ath/ath11k/mac.c
> +++ b/drivers/net/wireless/ath/ath11k/mac.c
> @@ -9274,6 +9274,10 @@ ath11k_mac_op_reconfig_complete(struct ieee80211_hw *hw,
> * the recovery has to be done for each radio
> */
> if (recovery_count == ab->num_radios) {
> + /* Cancel any pending work, preventing a second redudant
nits:
1) networking no longer uses a different block comment style, so use the
standard style where /* is on a line by itself
2) s/redudant/redundant/ (subject has it right)
but don't post a new version just for these -- wait for any other comments.
I'm pinging the development team to look at this thread.
> + * reset.
> + */
> + cancel_work(&ab->reset_work);
> atomic_dec(&ab->reset_count);
> complete(&ab->reset_complete);
> ab->is_reset = false;
>
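For anyone following the thread, the five steps in the commit message can be
modeled in user space roughly as below. This is only a toy sketch under stated
assumptions: every toy_* name is hypothetical, the model is single-threaded
and replays the steps in a fixed order, and it is neither the driver code nor
the kernel workqueue implementation. It only mirrors the pending-bit and
reset_count bookkeeping described above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the commit message talks about. */
struct toy_dev {
	bool pending;     /* models the work item's pending bit */
	int reset_count;  /* models ab->reset_count */
	int hw_resets;    /* number of full hardware resets started */
};

/* Models queue_work(): fails if the work is already pending. */
static bool toy_queue_work(struct toy_dev *d)
{
	if (d->pending)
		return false;
	d->pending = true;
	return true;
}

/* Models cancel_work(): drops a pending-but-not-yet-running request. */
static bool toy_cancel_work(struct toy_dev *d)
{
	bool was_pending = d->pending;

	d->pending = false;
	return was_pending;
}

/* Replays steps 1-5 and returns how many hardware resets were started. */
static int toy_run_scenario(bool cancel_on_complete)
{
	struct toy_dev d = { false, 0, 0 };

	/* Step 1: first timeout queues the work; the worker starts a reset. */
	toy_queue_work(&d);
	d.pending = false;   /* pending bit is cleared before the worker runs */
	d.reset_count++;     /* reset_count == 1 */
	d.hw_resets++;

	/* Step 2: second timeout re-queues; this worker blocks on completion. */
	toy_queue_work(&d);
	d.pending = false;
	d.reset_count++;     /* reset_count == 2, worker waits */

	/* Step 3: third timeout arms yet another execution. */
	toy_queue_work(&d);

	/* Step 4: reconfig completes; optionally drop the stale request. */
	if (cancel_on_complete)
		toy_cancel_work(&d);
	d.reset_count--;     /* decrement in the reconfig-complete path */
	d.reset_count--;     /* blocked worker wakes and takes the early exit */

	/* Step 5: a still-pending request runs the worker once more. */
	if (d.pending) {
		d.pending = false;
		d.reset_count++;
		if (d.reset_count == 1)
			d.hw_resets++;   /* the redundant reset */
	}

	return d.hw_resets;
}
```

In this model, toy_run_scenario(false) counts two hardware resets and
toy_run_scenario(true) counts one, which matches the behavior the commit
message describes before and after the cancel_work() call.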