connection hangs after wpa_supplicant re-key

Jouni Malinen j at w1.fi
Mon Dec 5 11:09:31 PST 2016


On Tue, Aug 16, 2016 at 03:21:22PM -0400, Kevin O'Connor wrote:
> I've found that my wifi connection always seems to hang after a few
> hours unless I dramatically increase "dot11RSNAConfigPMKLifetime".
> What's the best way to further debug (and hopefully find the root
> cause of this issue)?  What's the down side to increasing this value?
> 
> I'm using wpa_supplicant from git (31d3692f) on an openwrt based
> router (tp-link archer c7 / ath10k).  The 5ghz radio in the router is
> setup as a client using WPA2-EAP / TTLS / MSCHAPv2.  I don't have
> admin access to the AP, but it appears to be a "Ruckus Wireless
> ZoneFlex 802.11ac wave 2 4x4 access point".

Did you happen to figure out how to resolve this issue or find any more
details on the issue?

Based on the information here, it sounds like there is some kind of
interop issue with the specific AP used here for the case where EAP
reauthentication is forced by the client. As far as I can tell, this
works correctly in wpa_supplicant and works fine when tested against a
hostapd-based AP/Authenticator.

> I enabled debugging in wpa_supplicant (-dd -f /var/log/wpalog) and
> found that every 2 hours the AP seems to initiate a re-key event (mac
> addresses removed):

> Aug 07 19:15:48 wlan0: WPA: Group rekeying completed with (...mac...) [GTK=CCMP]

This is group (GTK) rekeying.

> this AP initiated re-keying is always successful.  However, every 8.4
> hours it seems as if the client initiates a re-key event which looks
> like:
> 
> Aug 07 20:15:52 EAPOL: txStart

While this one is a full EAP reauthentication, which is supposed to be
followed by a 4-way handshake to derive a new pairwise (unicast) key
(PTK).
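
As a side note, the 8.4 hour interval is consistent with the
wpa_supplicant defaults, assuming neither dot11RSNAConfigPMKLifetime
(default 43200 seconds) nor dot11RSNAConfigPMKReauthThreshold (default
70 percent) was changed in the configuration. A quick sketch of the
arithmetic (the variable names are only illustrative):

```python
# Assumed wpa_supplicant defaults (not taken from the reporter's config).
pmk_lifetime_s = 43200        # dot11RSNAConfigPMKLifetime, in seconds
reauth_threshold_pct = 70     # dot11RSNAConfigPMKReauthThreshold, in percent

# EAP reauthentication is scheduled once this share of the PMK lifetime
# has elapsed.
reauth_after_s = pmk_lifetime_s * reauth_threshold_pct // 100
reauth_after_h = reauth_after_s / 3600

print(reauth_after_s, "seconds =", reauth_after_h, "hours")
```

That works out to 30240 seconds, i.e., 8.4 hours, which would explain
why the client-initiated event shows up on that schedule.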

> Aug 07 20:15:55 wlan0: WPA: Key negotiation completed with (...mac...) [PTK=CCMP GTK=CCMP]

And it sounds like this was completed successfully as far as
wpa_supplicant is concerned.

> After this event the connection always goes into a "dead" state.  (OS
> reports interface up, but no packets come through.)  This "re-key
> event" seems much more intensive (the logs are much bigger - eg, it
> completes a x509 certificate check), but I see no indication of any
> failure or error messages.

This is indeed much more than the AP-triggered GTK rekeying.

Not seeing packets go through after this (successful-looking EAP reauth
+ 4-way handshake) would imply that there is some kind of mismatch in
the encryption keys between the AP and the station. This can be a bit
tricky to debug, though, since it would likely take either debug access
on both ends or a full sniffer capture with known keys, so that the
encrypted frames can be decrypted to analyze what exactly happened and
whether the AP or the STA is using incorrect keys to encrypt frames
after EAP reauthentication.

> The connection seems to stay in this dead state until the next AP
> initiated re-key event (often an hour or so later) - at which time the
> re-keying fails after several attempts and then the interface is reset
> causing the connection to come back up.  The pattern repeats itself
> every 8.4 hours.  Interestingly, after coming up the second time, the
> log has lots of these messages:
> 
> ...
> Aug 08 05:51:53 EAPOL: EAP Session-Id not available
> Aug 08 05:51:58 EAPOL: EAP Session-Id not available
> Aug 08 05:52:04 EAPOL: EAP Session-Id not available
> ...
> 
> However, these messages don't seem to adversely impact the connection
> or change the pattern above.

Yeah, those can be ignored since EAP Session-Id is not needed here. This
"fixing" of the issue is due to the AP initiating GTK rekeying and
failing to complete it with this STA and consequently forcing
disconnection. That failure to rekey is expected here since the data
connection was broken previously.
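
To line these events up in the debug log, a rough sketch along the
following lines may help. It assumes the log lines carry the
"Aug 07 19:15:48" style timestamp prefix seen in the excerpts above, it
assumes the year 2016 (the log has none), and it matches only on message
strings quoted in this thread. The function names and labels are made up
for illustration.

```python
from datetime import datetime

ASSUMED_YEAR = 2016  # assumption: the log prefix carries no year

# Message substrings taken from the log excerpts quoted in this thread.
EVENTS = {
    "WPA: Group rekeying completed": "gtk-rekey",
    "EAPOL: txStart": "eap-reauth-start",
    "WPA: Key negotiation completed": "4way-done",
}


def scan(lines):
    """Return (timestamp, label) pairs for the lines that match EVENTS."""
    found = []
    for line in lines:
        for needle, label in EVENTS.items():
            if needle in line:
                # Assumes a 15-character "Mon DD HH:MM:SS" prefix.
                stamp = datetime.strptime(
                    f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")
                found.append((stamp, label))
                break
    return found


def gaps_hours(events, label):
    """Hours between consecutive events carrying the given label."""
    times = [t for t, lbl in events if lbl == label]
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]


sample = [
    "Aug 07 19:15:48 wlan0: WPA: Group rekeying completed with (...mac...) [GTK=CCMP]",
    "Aug 07 20:15:52 EAPOL: txStart",
    "Aug 07 20:15:55 wlan0: WPA: Key negotiation completed with (...mac...) [PTK=CCMP GTK=CCMP]",
]
for stamp, label in scan(sample):
    print(stamp.strftime("%b %d %H:%M:%S"), label)
```

Passing an open file object for the log (e.g., /var/log/wpalog) to
scan() works the same way as the sample list, and gaps_hours(events,
"eap-reauth-start") then gives the spacing between client-initiated
reauthentications. Note that txStart may also show up at initial
association, so not every match is necessarily a reauthentication.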

> As above, I can work around the problem by increasing
> dot11RSNAConfigPMKLifetime in the config file.  I also tried setting
> "fast_reauth=0" but that did not have an impact.  With
> "dot11RSNAConfigPMKLifetime=31536000" I've seen a solid connection for
> multiple days.
> 
> Any ideas on how I can further debug/fix this?

See the notes above on what this would take: either debug output from
the AP, or a sniffer capture together with all the keys needed for
analysis.

Using a larger dot11RSNAConfigPMKLifetime value sounds like a reasonable
workaround for this, though. All it does here is give the AP full
control over when to force PMK rekeying (i.e., in practice, when to
force EAP reauthentication).
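
For reference, the workaround amounts to a single line in the global
section of the configuration file. The value below is the one from the
report above; the parameter is in seconds.

```
# wpa_supplicant.conf, global section (outside any network={} block)
# PMK lifetime in seconds; 31536000 is 365 days.
dot11RSNAConfigPMKLifetime=31536000
```

With a value this large the client-side reauthentication timer is
unlikely to fire in practice, so any EAP reauthentication would be
started by the AP.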

-- 
Jouni Malinen                                            PGP id EFC895FA


