[PATCH v2] ath10k: fix wmi mgmt tx queue full due to race condition
Kalle Valo
kvalo at codeaurora.org
Thu Jan 28 02:19:57 EST 2021
Miaoqing Pan <miaoqing at codeaurora.org> wrote:
> Failed to transmit wmi management frames:
>
> [84977.840894] ath10k_snoc a000000.wifi: wmi mgmt tx queue is full
> [84977.840913] ath10k_snoc a000000.wifi: failed to transmit packet, dropping: -28
> [84977.840924] ath10k_snoc a000000.wifi: failed to submit frame: -28
> [84977.840932] ath10k_snoc a000000.wifi: failed to transmit frame: -28
>
> This issue is caused by a race condition between skb_dequeue and
> __skb_queue_tail. The two sides of ‘wmi_mgmt_tx_queue’ take different
> locks, ar->data_lock vs list->lock, so in effect the queue has no
> protection. When ath10k_mgmt_over_wmi_tx_work() and
> ath10k_mac_tx_wmi_mgmt() run concurrently on different CPUs, a rare
> corner case can occur when the queue length is 1:
>
> CPUx (skb_dequeue)                        CPUy (__skb_queue_tail)
>                                           next=list
>                                           prev=list
> struct sk_buff *skb = skb_peek(list);     WRITE_ONCE(newsk->next, next);
> WRITE_ONCE(list->qlen, list->qlen - 1);   WRITE_ONCE(newsk->prev, prev);
> next = skb->next;                         WRITE_ONCE(next->prev, newsk);
> prev = skb->prev;                         WRITE_ONCE(prev->next, newsk);
> skb->next = skb->prev = NULL;             list->qlen++;
> WRITE_ONCE(next->prev, prev);
> WRITE_ONCE(prev->next, next);
>
> If the instruction ‘next = skb->next’ is executed before
> ‘WRITE_ONCE(prev->next, newsk)’, newsk is lost, because CPUx has read
> the old ‘next’ pointer, yet the length is still incremented by one.
> The final result is that the recorded length of the queue reaches the
> maximum value even though the queue is empty.
>
> So remove ar->data_lock, and use 'skb_queue_tail' instead of
> '__skb_queue_tail' to prevent the potential race condition. Also switch
> to skb_queue_len_lockless for the length check, in case a few SKBs are
> queued simultaneously.
>
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1.c2-00033-QCAHLSWMTPLZ-1
>
> Signed-off-by: Miaoqing Pan <miaoqing at codeaurora.org>
> Reviewed-by: Brian Norris <briannorris at chromium.org>
> Signed-off-by: Kalle Valo <kvalo at codeaurora.org>
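
For readers who want to step through the interleaving in the quoted commit
message, here is a minimal user-space sketch in C. It is not kernel code:
struct node, struct queue and replay_race() are hypothetical stand-ins for
struct sk_buff, struct sk_buff_head and the two racing functions. The
replay runs both CPUs' steps by hand on a single thread, in one of the bad
orders. It assumes the tail insert takes prev from the current tail, which
in a one-element queue is the skb being dequeued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for struct sk_buff / sk_buff_head. */
struct node {
    struct node *next;
    struct node *prev;
};

struct queue {
    struct node head;      /* sentinel: an empty queue points at itself */
    unsigned int qlen;
};

static void queue_init(struct queue *q)
{
    q->head.next = &q->head;
    q->head.prev = &q->head;
    q->qlen = 0;
}

/* Plain tail insert, used only to set up the one-element starting state. */
static void queue_tail_unlocked(struct queue *q, struct node *n)
{
    struct node *next = &q->head;
    struct node *prev = q->head.prev;

    n->next = next;
    n->prev = prev;
    next->prev = n;
    prev->next = n;
    q->qlen++;
}

/*
 * Replays one bad interleaving by hand on a single thread.
 * "y" steps belong to the enqueuing CPU, "x" steps to the dequeuing CPU.
 * Returns the node the dequeuer removed.
 */
static struct node *replay_race(struct queue *q, struct node *newsk)
{
    /* y: start the tail insert; prev is the current tail */
    struct node *y_next = &q->head;
    struct node *y_prev = q->head.prev;
    newsk->next = y_next;
    newsk->prev = y_prev;
    y_next->prev = newsk;

    /* x: dequeue begins before y has written prev->next */
    struct node *skb = q->head.next;
    q->qlen--;
    struct node *x_next = skb->next;   /* stale: still the list head */
    struct node *x_prev = skb->prev;
    skb->next = skb->prev = NULL;

    /* y: finishes, linking newsk behind a node that has already left */
    y_prev->next = newsk;
    q->qlen++;

    /* x: finishes, closing the list over the stale pointers */
    x_next->prev = x_prev;
    x_prev->next = x_next;

    return skb;
}
```

Starting from a queue that holds one node, replay_race() leaves the list
head pointing at itself (empty) while qlen is still 1. Each repetition
leaks one more count, which matches the symptom above: the length check
eventually fails with -28 (ENOSPC) although nothing is queued.
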
Patch applied to ath-next branch of ath.git, thanks.
b55379e343a3 ath10k: fix wmi mgmt tx queue full due to race condition
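
The shape of the fix can be pictured with a similar user-space analog,
again only as a sketch under stated assumptions. Enqueue and dequeue both
take the queue's own lock, in the spirit of skb_queue_tail() and
skb_dequeue(), and the "queue full" check reads the length without the
lock, in the spirit of skb_queue_len_lockless(). All names below
(queue_tail, queue_dequeue, queue_len_lockless, QUEUE_LIMIT) are
hypothetical, and the spinlock built on C11 atomic_flag merely stands in
for list->lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space analog; none of these names are kernel APIs. */
struct node {
    struct node *next;
    struct node *prev;
};

struct queue {
    struct node head;    /* sentinel */
    atomic_uint qlen;    /* updated under the lock, readable without it */
    atomic_flag lock;    /* the queue's own lock, standing in for list->lock */
};

#define QUEUE_LIMIT 4u   /* arbitrary cap chosen for the illustration */

static void queue_init(struct queue *q)
{
    q->head.next = &q->head;
    q->head.prev = &q->head;
    atomic_init(&q->qlen, 0);
    atomic_flag_clear(&q->lock);
}

static void queue_lock(struct queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void queue_unlock(struct queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Length read without taking the lock; the value may be slightly stale. */
static unsigned int queue_len_lockless(struct queue *q)
{
    return atomic_load_explicit(&q->qlen, memory_order_relaxed);
}

/* Tail insert under the same lock the dequeuer uses. Returns -1 if full. */
static int queue_tail(struct queue *q, struct node *n)
{
    if (queue_len_lockless(q) >= QUEUE_LIMIT)
        return -1;

    queue_lock(q);
    n->next = &q->head;
    n->prev = q->head.prev;
    q->head.prev->next = n;
    q->head.prev = n;
    atomic_fetch_add_explicit(&q->qlen, 1, memory_order_relaxed);
    queue_unlock(q);
    return 0;
}

/* Head removal under the lock. Returns NULL when the queue is empty. */
static struct node *queue_dequeue(struct queue *q)
{
    struct node *n = NULL;

    queue_lock(q);
    if (q->head.next != &q->head) {
        n = q->head.next;
        n->next->prev = n->prev;
        n->prev->next = n->next;
        n->next = n->prev = NULL;
        atomic_fetch_sub_explicit(&q->qlen, 1, memory_order_relaxed);
    }
    queue_unlock(q);
    return n;
}
```

Because the limit check runs outside the lock, several concurrent
producers could each pass it and overshoot QUEUE_LIMIT by a few entries.
That trade-off is accepted in this sketch; the point is that the pointer
updates and the length counter can no longer drift apart.
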
--
https://patchwork.kernel.org/project/linux-wireless/patch/1608618887-8857-1-git-send-email-miaoqing@codeaurora.org/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches