[PATCH ath-next v3] wifi: ath11k: fix group data packet drops during rekey
Jeff Johnson
jeff.johnson at oss.qualcomm.com
Tue Aug 5 10:51:51 PDT 2025
On 8/4/2025 12:53 AM, Rameshkumar Sundaram wrote:
> During GTK rekey, mac80211 issues a clear key (if the old key exists)
> followed by an install key operation in the same context. This causes
> ath11k to send two WMI commands in quick succession: one to clear the
> old key and another to install the new key in the same slot.
>
> Under certain conditions, especially under high load or in time-sensitive
> scenarios, firmware may process these commands asynchronously such that
> firmware assumes the key is cleared whereas hardware still has a valid key.
> This inconsistency between hardware and firmware leads to group addressed
> packet drops. Only setting the same key again can restore a valid key in
> firmware and allow packets to be transmitted.
>
> This issue remained latent because the host's clear key commands were
> not effective in firmware until commit 436a4e886598 ("ath11k: clear the
> keys properly via DISABLE_KEY"). That commit enabled the host to
> explicitly clear group keys, which inadvertently exposed the race.
>
> To mitigate this, restrict group key clearing across all modes (AP, STA,
> MESH). During rekey, the new key can simply be set on top of the previous
> one, avoiding the need for a clear followed by a set.
>
> However, in AP mode specifically, permit group key clearing when no
> stations are associated. This exception supports transitions from secure
> modes (e.g., WPA2/WPA3) to open mode, during which all associated peers
> are removed and the group key is cleared as part of the transition.
>
> Add a per-BSS station counter to track the presence of stations during
> set key operations. Also add a reset_group_keys flag to track the key
> re-installation state and avoid repeated installation of the same key
> when the number of connected stations transitions to non-zero within a
> rekey period.
>
> Additionally, for AP and Mesh modes, when the first station associates,
> reinstall the same group key that was last set. This ensures that the
> firmware recovers from any race that may have occurred during a previous
> key clear when no stations were associated.
>
> This change ensures that key clearing is permitted only when no clients
> are connected, avoiding packet loss while enabling dynamic security mode
> transitions.
>
> Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.9.0.1-02146-QCAHKSWPL_SILICONZ-1
> Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
>
> Reported-by: Steffen Moser <lists at steffen-moser.de>
> Closes: https://lore.kernel.org/linux-wireless/c6366409-9928-4dd7-bf7b-ba7fcf20eabf@steffen-moser.de
> Fixes: 436a4e886598 ("ath11k: clear the keys properly via DISABLE_KEY")
> Signed-off-by: Rameshkumar Sundaram <rameshkumar.sundaram at oss.qualcomm.com>
> ---
> v3:
> - Allowed clear key strictly only for AP mode
> - Rephrased the commit text and comment in ath11k_mac_op_set_key()
> to explain that the race occurs rarely under heavy load and causes a
> discrepancy of states between firmware and hardware
> - Added newline (\n) at the end of warn log in ath11k_set_group_keys()
> - Made addr variable const * in ath11k_set_group_keys()
> v2:
> - Followed r-xmas style
> - Removed vdev_type check before calling ath11k_set_group_keys()
> - Removed lockdep assert in ath11k_set_group_keys()
> - Removed flags variable and passed WMI_KEY_GROUP in ath11k_set_group_keys()
> - Changed the if condition to have positive cases in ath11k_mac_op_set_key()
> - Added code comments in ath11k_mac_op_set_key() to explain how clear key and
> set key are issued back to back by mac80211 and added high level information
> about the firmware race.
> - Added comments in ath11k_mac_op_set_key() and ath11k_mac_station_add() to
> explain why a reinstall of same key is needed.
> ---
To clear up any confusion, this should have been a v3 posted internally for
review, and the history listed is from prior internal review.
So this is actually the first public version, but to avoid confusion, all
follow-up to the current public review should be incorporated in a public v4.
Also the follow-up should use the ath-current branch tag since this should be
incorporated into v6.17 instead of waiting for the v6.18 development cycle to
complete.
/jeff