connection hangs after wpa_supplicant re-key

Kevin O'Connor kevin at koconnor.net
Tue Aug 16 12:21:22 PDT 2016


Hi,

I've found that my wifi connection always seems to hang after a few
hours unless I dramatically increase "dot11RSNAConfigPMKLifetime".
What's the best way to debug this further (and hopefully find the root
cause)?  What's the downside to increasing this value?

I'm using wpa_supplicant from git (31d3692f) on an OpenWrt-based
router (TP-Link Archer C7 / ath10k).  The 5 GHz radio in the router is
set up as a client using WPA2-EAP / TTLS / MSCHAPv2.  I don't have
admin access to the AP, but it appears to be a "Ruckus Wireless
ZoneFlex 802.11ac wave 2 4x4 access point".

I have been able to successfully configure and connect the client to
the access point.  However, after several hours of solid connectivity,
the connection always goes into a "dead" state - the OS reports the
connection up, but all packets are lost.

I enabled debugging in wpa_supplicant (-dd -f /var/log/wpalog) and
found that every 2 hours the AP seems to initiate a re-key event (MAC
addresses removed):

Aug 07 19:15:48 l2_packet_receive: src=(...mac...) len=131
Aug 07 19:15:48 wlan0: RX EAPOL from (...mac...)
Aug 07 19:15:48 EAPOL: Ignoring WPA EAPOL-Key frame in EAPOL state machines
...
Aug 07 19:15:48 wlan0: WPA: Group rekeying completed with (...mac...) [GTK=CCMP]
Aug 07 19:15:48 wlan0: Cancelling authentication timeout
Aug 07 19:15:48 wlan0: State: GROUP_HANDSHAKE -> COMPLETED

This AP-initiated re-keying is always successful.  However, every 8.4
hours the client appears to initiate a re-key event, which looks
like:

Aug 07 20:15:52 EAPOL: txStart
Aug 07 20:15:52 TX EAPOL: dst=(...mac...)
Aug 07 20:15:52 l2_packet_receive: src=(...mac...) len=9
Aug 07 20:15:52 wlan0: RX EAPOL from (...mac...)
...
Aug 07 20:15:55 wlan0: WPA: Key negotiation completed with (...mac...) [PTK=CCMP GTK=CCMP]
Aug 07 20:15:55 wlan0: Cancelling authentication timeout
Aug 07 20:15:55 wlan0: State: GROUP_HANDSHAKE -> COMPLETED
Aug 07 20:15:55 EAPOL: External notification - portValid=1
Aug 07 20:16:25 EAPOL: authWhile --> 0
Aug 07 20:16:55 EAPOL: idleWhile --> 0
Aug 07 20:16:55 EAPOL: disable timer tick

After this event the connection always goes into a "dead" state.  (OS
reports interface up, but no packets come through.)  This "re-key
event" seems much more intensive (the logs are much bigger; e.g., it
completes an X.509 certificate check), but I see no indication of any
failure or error messages.
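
A note on the 8.4 hour figure: it lines up with the defaults documented
in the sample wpa_supplicant.conf, where dot11RSNAConfigPMKLifetime is
43200 seconds and dot11RSNAConfigPMKReauthThreshold is 70 (percent).
If those defaults are in effect here (an assumption; the config in use
is not shown), the client would start a full re-authentication at 70%
of the PMK lifetime.  A small sketch of that arithmetic, where the
function name is made up for illustration:

```python
# Assumed defaults, taken from the sample wpa_supplicant.conf:
PMK_LIFETIME_S = 43200        # dot11RSNAConfigPMKLifetime (seconds)
REAUTH_THRESHOLD_PCT = 70     # dot11RSNAConfigPMKReauthThreshold (%)


def reauth_interval_hours(lifetime_s, threshold_pct):
    """Hours until re-authentication starts, given a PMK lifetime in
    seconds and a reauth threshold in percent of that lifetime."""
    return lifetime_s * threshold_pct / 100 / 3600


# Default settings: matches the 8.4 hour pattern seen in the log.
print(reauth_interval_hours(PMK_LIFETIME_S, REAUTH_THRESHOLD_PCT))  # 8.4

# With dot11RSNAConfigPMKLifetime=31536000 (one year), in days:
print(reauth_interval_hours(31536000, REAUTH_THRESHOLD_PCT) / 24)   # 255.5
```

If that reading is right, raising the lifetime does not fix the failing
re-authentication; it only pushes it out to roughly 255 days.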

The connection seems to stay in this dead state until the next
AP-initiated re-key event (often an hour or so later), at which time
the re-keying fails after several attempts and the interface is then
reset, causing the connection to come back up.  The pattern repeats
itself every 8.4 hours.  Interestingly, after coming up the second
time, the log has lots of these messages:

...
Aug 08 05:51:53 EAPOL: EAP Session-Id not available
Aug 08 05:51:58 EAPOL: EAP Session-Id not available
Aug 08 05:52:04 EAPOL: EAP Session-Id not available
...

However, these messages don't seem to adversely impact the connection
or change the pattern above.
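
The timing pattern above can be pulled out of the debug log with a
short script.  This is only a sketch: it assumes the log lines carry
timestamps in the same "Aug 07 19:15:48" form as the excerpts in this
message, assumes the year is 2016 (the log itself has no year), and the
function names are made up for illustration.  The two sample lines are
copied from the excerpts above.

```python
import re
from datetime import datetime

# Matches the two "completed" messages shown in the excerpts, e.g.
# "Aug 07 19:15:48 wlan0: WPA: Group rekeying completed with ..."
LINE_RE = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) .*WPA: "
    r"(Group rekeying|Key negotiation) completed"
)


def rekey_events(lines, year=2016):
    """Return (timestamp, kind) for each completed re-key in the log.
    The year is an assumption, since the log lines do not include one."""
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            when = datetime.strptime(f"{year} {m.group(1)}",
                                     "%Y %b %d %H:%M:%S")
            events.append((when, m.group(2)))
    return events


def intervals_hours(events, kind):
    """Hours between consecutive events of one kind."""
    times = [t for t, k in events if k == kind]
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]


sample = [
    "Aug 07 19:15:48 wlan0: WPA: Group rekeying completed with (...mac...) [GTK=CCMP]",
    "Aug 07 20:15:55 wlan0: WPA: Key negotiation completed with (...mac...) [PTK=CCMP GTK=CCMP]",
]
for when, kind in rekey_events(sample):
    print(when.isoformat(), kind)
```

Feeding it the full /var/log/wpalog (for example via
rekey_events(open("/var/log/wpalog"))) and calling intervals_hours()
for each kind should show whether the 2 hour and 8.4 hour spacings hold
over a longer run.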

As noted above, I can work around the problem by increasing
dot11RSNAConfigPMKLifetime in the config file.  I also tried setting
"fast_reauth=0", but that had no effect.  With
"dot11RSNAConfigPMKLifetime=31536000" I've seen a solid connection for
multiple days.

Any ideas on how I can further debug/fix this?

Thanks,
-Kevin


