[question, bug] regularly disconnecting wifi on RB1 and RB2 boards, ath10
Loic Poulain
loic.poulain at oss.qualcomm.com
Wed Jul 23 03:42:37 PDT 2025
Hi Alexey,
On Tue, Jul 22, 2025 at 5:53 PM Loic Poulain
<loic.poulain at oss.qualcomm.com> wrote:
>
> On Fri, Jun 27, 2025 at 1:09 AM Bryan O'Donoghue
> <bryan.odonoghue at linaro.org> wrote:
> >
> > On 26/06/2025 15:48, Alexey Klimov wrote:
> > > Hi all,
> > >
> > > After a long time of testing it seems the problem narrows down to qrb2210 rb1
> > > and qrb4210 rb2 boards.
> > >
> > > After booting, the board connects to the wifi network and after around 5-10
> > > minutes it loses the connection (nothing in dmesg). A simple ping of another
> > > machine on the local network doesn't work. After, I guess, around 5000
> > > seconds the GROUP_KEY_HANDSHAKE_TIMEOUT message is printked:
> > >
> > > [ 5064.093748] wlan0: deauthenticated from 8c:58:72:d4:d1:8d (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
> > > [ 5067.083790] wlan0: authenticate with 8c:58:72:d4:d1:8d (local address=82:95:77:b1:05:a5)
> > > [ 5067.091971] wlan0: send auth to 8c:58:72:d4:d1:8d (try 1/3)
> > > [ 5067.100192] wlan0: authenticated
> > > [ 5067.104734] wlan0: associate with 8c:58:72:d4:d1:8d (try 1/3)
> > > [ 5067.113230] wlan0: RX AssocResp from 8c:58:72:d4:d1:8d (capab=0x11 status=0 aid=2)
> > > [ 5067.193624] wlan0: associated
> > >
> > > and after that wireless connection works for ~5-10 minutes and then the cycle
> > > repeats. The longer log with more info and some info with firmware versions,
> > > ids, etc is at the end of this email [1]. Simple wlan0 down and wlan0 up fixes
> > > things for a few minutes.
> > >
> > > iw wlan0 link reports the following when wireless network is working:
> > >
> > > root@rb1:~# iw wlan0 link
> > > Connected to 8c:58:72:d4:d1:8d (on wlan0)
> > > SSID: void
> > > freq: 5300
> > > RX: 45802 bytes (424 packets)
> > > TX: 71260 bytes (125 packets)
> > > signal: -66 dBm
> > > rx bitrate: 433.3 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 1
> > >
> > > bss flags: short-slot-time
> > > dtim period: 1
> > > beacon int: 100
> > >
> > > and this when wireless connection doesn't work:
> > >
> > > Connected to 8c:58:72:d4:d1:8d (on wlan0)
> > > SSID: void
> > > freq: 5300
> > > RX: 850615 bytes (9623 packets)
> > > TX: 20372 bytes (247 packets)
> > > signal: -61 dBm
> > > rx bitrate: 6.0 MBit/s
> > >
> > > bss flags: short-slot-time
> > > dtim period: 1
> > > beacon int: 100
> > >
> > > This was tested with three different routers and different wifi networks.
> > > Other devices here do not exhibit this behaviour.
> > >
> > > Any hints on how to debug this? Any debug switches I can toggle to debug this?
> > > I am happy to provide more info or test changes/patches if any.
> > >
> > > Thanks in advance.
> > > Best regards,
> > > Alexey
> > >
> > > [1]:
> > >
> > > [ 7.758934] ath10k_snoc c800000.wifi: qmi chip_id 0x120 chip_family 0x4007 board_id 0xff soc_id 0x40670000
> > > [ 7.769740] ath10k_snoc c800000.wifi: qmi fw_version 0x337703a3 fw_build_timestamp 2023-10-14 01:26 fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HL.3.3.7.c2-00931-QCAHLSWMTPLZ-1
> > > [ 11.086123] ath10k_snoc c800000.wifi: wcn3990 hw1.0 target 0x00000008 chip_id 0x00000000 sub 0000:0000
> > > [ 11.095622] ath10k_snoc c800000.wifi: kconfig debug 0 debugfs 0 tracing 0 dfs 0 testmode 0
> > > [ 11.103998] ath10k_snoc c800000.wifi: firmware ver api 5 features wowlan,mgmt-tx-by-reference,non-bmi,single-chan-info-per-channel crc32 a79c5b24
> > > [ 11.144810] ath10k_snoc c800000.wifi: htt-ver 3.128 wmi-op 4 htt-op 3 cal file max-sta 32 raw 0 hwcrypto 1
> > > [ 11.230894] ath10k_snoc c800000.wifi: invalid MAC address; choosing random
> > > [ 11.238128] ath: EEPROM regdomain: 0x0
> > > [ 11.242060] ath: EEPROM indicates default country code should be used
> > > [ 11.248582] ath: doing EEPROM country->regdmn map search
> > > [ 11.253950] ath: country maps to regdmn code: 0x3a
> > > [ 11.258805] ath: Country alpha2 being used: US
> > > [ 11.263466] ath: Regpair used: 0x3a
> > > [ 15.355756] wlan0: authenticate with 8c:58:72:d4:d1:8d (local address=82:95:77:b1:05:a5)
> > > [ 15.363942] wlan0: send auth to 8c:58:72:d4:d1:8d (try 1/3)
> > > [ 15.372142] wlan0: authenticated
> > > [ 15.377928] wlan0: associate with 8c:58:72:d4:d1:8d (try 1/3)
> > > [ 15.386338] wlan0: RX AssocResp from 8c:58:72:d4:d1:8d (capab=0x11 status=0 aid=2)
> > > [ 15.466514] wlan0: associated
> > > [ 23.167251] systemd-journald[195]: Oldest entry in /var/log/journal/ec3e0078e5e0499bac67949f3edf3fcf/system.journal is older than the configured file retention duration (1month), suggesting rotation.
> > > [ 23.185186] systemd-journald[195]: /var/log/journal/ec3e0078e5e0499bac67949f3edf3fcf/system.journal: Journal header limits reached or header out-of-date, rotating.
> > > [ 31.750177] l5: disabling
> > > [ 31.753382] l11: disabling
> > > [ 31.756385] l16: disabling
> > > [ 5064.093748] wlan0: deauthenticated from 8c:58:72:d4:d1:8d (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
> >
> > So.
> >
> > I wonder what state the GTK offload is in here.
> >
> > WMI_GTK_OFFLOAD_CMDID = WMI_CMD_GRP(WMI_GRP_GTK_OFL),
> >
> > drivers/net/wireless/ath/ath10k/wmi-tlv.c: cfg->gtk_offload_max_vdev =
> > __cpu_to_le32(2);
> >
> > Try toggling that offload off or on and see what happens.
> >
> > > [ 5067.083790] wlan0: authenticate with 8c:58:72:d4:d1:8d (local address=82:95:77:b1:05:a5)
> > > [ 5067.091971] wlan0: send auth to 8c:58:72:d4:d1:8d (try 1/3)
> > > [ 5067.100192] wlan0: authenticated
> > > [ 5067.104734] wlan0: associate with 8c:58:72:d4:d1:8d (try 1/3)
> > > [ 5067.113230] wlan0: RX AssocResp from 8c:58:72:d4:d1:8d (capab=0x11 status=0 aid=2)
> > > [ 5067.193624] wlan0: associated
> > > [10437.346541] wlan0: deauthenticated from 8c:58:72:d4:d1:8d (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
> > > [10440.340111] wlan0: authenticate with 8c:58:72:d4:d1:8d (local address=82:95:77:b1:05:a5)
> > > [10440.348408] wlan0: send auth to 8c:58:72:d4:d1:8d (try 1/3)
> > > [10440.356698] wlan0: authenticated
> > > [10440.361077] wlan0: associate with 8c:58:72:d4:d1:8d (try 1/3)
> > > [10440.369516] wlan0: RX AssocResp from 8c:58:72:d4:d1:8d (capab=0x11 status=0 aid=2)
> > > [10440.446661] wlan0: associated
> > >
> > You can put another device on your WiFi network into monitor mode and
> > sniff what is taking place.
> >
> > I've used Kali Linux on an RPi for this purpose in the past, and it was
> > very easy to do.
> >
> > https://cyberlab.pacific.edu/resources/lab-network-wireless-sniffing
> >
> > Another thing to try is to do this same test on an open (unencrypted) link.
> >
> > If we really suspect firmware here, let's try switching off firmware
> > offload features one by one, starting with GTK offload.
> >
> > ---
> > bod
> >
>
> I configured the GTK rekey interval to one minute and encountered a
> similar issue. It appears that something may be going wrong after the
> GTK rekeying process completes.
>
> The GTK update is handled entirely by wpa_supplicant (not offloaded),
> and while the new key seems to be installed correctly, with frames
> still being transmitted and received (from aircap perspective), they
> appear to be dropped or mishandled in the RX firmware path.
>
> This suggests there might be an issue with how the new keys are being
> applied or interpreted by the firmware. I’ll continue debugging to
> pinpoint the root cause.
>
> Regards,
> Loic
Could you check if this change helps:
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index c61b95a928da..4fa7dd62aeac 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -288,8 +288,10 @@ static int ath10k_send_key(struct ath10k_vif *arvif,
 		key->flags |= IEEE80211_KEY_FLAG_GENERATE_IV;
 	if (cmd == DISABLE_KEY) {
-		arg.key_cipher = ar->wmi_key_cipher[WMI_CIPHER_NONE];
-		arg.key_data = NULL;
+		/* Not all hardware supports key deletion operations, so we
+		 * replace the key with a junk value to invalidate it.
+		 */
+		memset(arg.key_data, 0, arg.key_len);
 	}
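
For readers who want to see the behavioural difference outside the kernel tree, here is a minimal user-space sketch of the two DISABLE_KEY strategies in the hunk. Everything prefixed with sketch_ is a hypothetical stand-in invented for illustration, not the real ath10k types or functions, and the NULL check is an extra guard that is not part of the patch. The sketch does not address whether the driver's key buffer may be overwritten in place.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the key-install argument; not the real
 * ath10k structure, only the three fields the hunk touches. */
struct sketch_key_arg {
	uint32_t key_cipher;  /* cipher identifier passed to the firmware */
	uint8_t *key_data;    /* key material buffer */
	size_t key_len;       /* length of key_data in bytes */
};

/* Behaviour of the removed ('-') lines: signal deletion by switching
 * the cipher to "none" and passing no key material. */
static void sketch_disable_old(struct sketch_key_arg *arg, uint32_t cipher_none)
{
	arg->key_cipher = cipher_none;
	arg->key_data = NULL;
}

/* Behaviour of the added ('+') lines: keep cipher and length as they
 * are, and overwrite the key bytes with zeros to invalidate the key. */
static void sketch_disable_new(struct sketch_key_arg *arg)
{
	if (arg->key_data)  /* guard added for the sketch only */
		memset(arg->key_data, 0, arg->key_len);
}
```

With the new variant the firmware would still receive a key-install command for the original cipher, just with all-zero key material, instead of a command with cipher "none" and no data.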
Regards,
Loic
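
When testing the change, it may help to measure how far apart the deauthentications are. In the quoted log the two GROUP_KEY_HANDSHAKE_TIMEOUT lines are at 5064.093748 and 10437.346541, about 5373 seconds apart. The helper below is a small sketch that computes such intervals from raw dmesg lines (without the "> " quoting prefixes). The function names are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Parse the leading "[ seconds.micros]" dmesg timestamp.
 * Returns 0 on success, -1 if the line does not start with one. */
static int parse_dmesg_ts(const char *line, double *ts)
{
	return sscanf(line, " [ %lf", ts) == 1 ? 0 : -1;
}

/* Scan dmesg lines for GROUP_KEY_HANDSHAKE_TIMEOUT events and store the
 * time between consecutive events in out[]. Returns the number of
 * intervals written (at most max_out). */
static size_t deauth_intervals(const char *const *lines, size_t n,
			       double *out, size_t max_out)
{
	double prev = 0.0, ts;
	int have_prev = 0;
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!strstr(lines[i], "GROUP_KEY_HANDSHAKE_TIMEOUT"))
			continue;
		if (parse_dmesg_ts(lines[i], &ts) != 0)
			continue;
		if (have_prev && count < max_out)
			out[count++] = ts - prev;
		prev = ts;
		have_prev = 1;
	}
	return count;
}
```

A caller would read dmesg output line by line, pass the lines to deauth_intervals(), and print the results. If the fix works, no new intervals should appear after the GTK rekey period has elapsed.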