Hostapd to Wpa_supplicant 4 way Handshake Problem

Southworth, Damon Damon.Southworth
Tue Jul 29 02:05:45 PDT 2008


Hello,

I have been investigating a problem that I have been experiencing on a
WPA enterprise network and I was hoping to bounce some thoughts and
questions off other people with knowledge in this area using the same or
similar software and platforms as I. I have put a fair bit of effort
into this so far but it has got to the point where I certainly need to
throw it out to a wider audience.

The network is using the following software.

The RADUIS server is FreeRADUIS.
The AP is the Network Authenticator is using HostAPD (0.5.10).
The stations are running wpa_supplicant (0.6.3).
Both the AP and STA wireless communication is provided by Atheros
chipsets and the latest stable (0.9.4) madwifi drivers. 
The Radius server, AP and STA devices are using recent GNU/Linux O/S
Authentication protocol is PEAP (PEAPv0/EAP-MSCHAPv2)

On the WPA network in question, the Network Authenticator was configured
to force re-authentication at a fairly high frequency and it was noticed
that the STA devices were getting regular de-authentication packets over
the wireless network. The stations would come straight back up again,
re-associate and re-authenticate but the whole process would result in a
loss of communication between 5 and 10 seconds. In my scenario this
would be long enough for a higher level application to timeout and drop
a connection which is why the problem was noticed in the first place. 

I started to dig into the problem and found that these
de-authentications were happening during the 4-way handshake, in
particular during the 3/4 and 4/4 exchange where the new pair wise keys
are passed to the driver. This is a hostapd log as the problem happens:

	------ Log ------
	ath2: RADIUS Received 170 bytes from RADIUS server
	ath2: RADIUS Received RADIUS message
	ath2: STA 00:0e:8e:0d:89:9b RADIUS: Received RADIUS packet
matched with a pending request, round trip time 0.00 sec
	RADIUS packet matching with station 00:0e:8e:0d:89:9b
	ath2: STA 00:0e:8e:0d:89:9b IEEE 802.1X: old identity 'testuser'
updated with User-Name from Access-Accept 'testuser'
	ath2: STA 00:0e:8e:0d:89:9b IEEE 802.1X: decapsulated EAP packet
(code=3 id=139 len=4) from RADIUS server: EAP Success
	IEEE 802.1X: Sending EAP Packet to 00:0e:8e:0d:89:9b (identifier
139)
	madwifi_set_sta_authorized: addr=00:0e:8e:0d:89:9b authorized=1
	ath2: STA 00:0e:8e:0d:89:9b IEEE 802.1X: authorizing port
	madwifi_sta_clear_stats: addr=00:0e:8e:0d:89:9b
	ath2: STA 00:0e:8e:0d:89:9b IEEE 802.1X: authenticated - EAP
type: 25 (PEAP)
	ath2: STA 00:0e:8e:0d:89:9b WPA: sending 1/4 msg of 4-Way
Handshake
	IEEE 802.1X: 123 bytes from 00:0e:8e:0d:89:9b
	   IEEE 802.1X: version=1 type=3 length=119
	ath2: STA 00:0e:8e:0d:89:9b WPA: received EAPOL-Key frame (2/4
Pairwise)
	madwifi_get_seqnum: addr=00:00:00:00:00:00 idx=2
	ath2: STA 00:0e:8e:0d:89:9b WPA: sending 3/4 msg of 4-Way
Handshake
	ath2: STA 00:0e:8e:0d:89:9b WPA: EAPOL-Key timeout
	madwifi_get_seqnum: addr=00:00:00:00:00:00 idx=2
	ath2: STA 00:0e:8e:0d:89:9b WPA: sending 3/4 msg of 4-Way
Handshake
	ath2: STA 00:0e:8e:0d:89:9b WPA: EAPOL-Key timeout
	madwifi_get_seqnum: addr=00:00:00:00:00:00 idx=2
	ath2: STA 00:0e:8e:0d:89:9b WPA: sending 3/4 msg of 4-Way
Handshake
	ath2: STA 00:0e:8e:0d:89:9b WPA: EAPOL-Key timeout
	madwifi_get_seqnum: addr=00:00:00:00:00:00 idx=2
	ath2: STA 00:0e:8e:0d:89:9b WPA: sending 3/4 msg of 4-Way
Handshake
	madwifi_sta_deauth: addr=00:0e:8e:0d:89:9b reason_code=2
	madwifi_del_key: addr=00:0e:8e:0d:89:9b key_idx=0
	ioctl[IEEE80211_IOCTL_DELKEY]: Invalid argument
	ath2: STA 00:0e:8e:0d:89:9b IEEE 802.11: deauthenticated due to
local deauth request
	ath2: STA 00:0e:8e:0d:89:9b WPA: strict rekeying - force GTK
rekey since STA is leaving
	------ End Log ------

This shows that the AP did not receive the 4/4 packet and so went into
retry before giving up. However, on the STA side we see that it has
received 3/4 and replied with 4/4.

	------ Log ------
	Fri Jul 25 17:56:48 2008: EAP-TLV: TLV Result - Success -
EAP-TLV/Phase2 Completed
	Fri Jul 25 17:56:48 2008: RX EAPOL from 00:c0:a8:d1:25:03
	Fri Jul 25 17:56:48 2008: CTRL-EVENT-EAP-SUCCESS EAP
authentication completed successfully
	Fri Jul 25 17:56:48 2008: RX EAPOL from 00:c0:a8:d1:25:03
	Fri Jul 25 17:56:48 2008: IEEE 802.1X RX: version=2 type=3
length=95
	Fri Jul 25 17:56:48 2008: State: COMPLETED -> 4WAY_HANDSHAKE
	Fri Jul 25 17:56:48 2008: WPA: RX message 1 of 4-Way Handshake
from 00:c0:a8:d1:25:03 (ver=2)
	Fri Jul 25 17:56:48 2008: EAPOL: Successfully fetched key
(len=32)
	Fri Jul 25 17:56:48 2008: WPA: Sending EAPOL-Key 2/4
	Fri Jul 25 17:56:48 2008: RX EAPOL from 00:c0:a8:d1:25:03
	Fri Jul 25 17:56:48 2008: IEEE 802.1X RX: version=2 type=3
length=123
	Fri Jul 25 17:56:48 2008: State: 4WAY_HANDSHAKE ->
4WAY_HANDSHAKE
	Fri Jul 25 17:56:48 2008: WPA: RX message 3 of 4-Way Handshake
from 00:c0:a8:d1:25:03 (ver=2)
	Fri Jul 25 17:56:48 2008: WPA: Sending EAPOL-Key 4/4
	Fri Jul 25 17:56:48 2008: State: 4WAY_HANDSHAKE ->
GROUP_HANDSHAKE
	Fri Jul 25 17:56:51 2008: RTM_NEWLINK: operstate=1
ifi_flags=0x1043 ([UP][RUNNING])
	Fri Jul 25 17:56:51 2008: Wireless event: cmd=0x8b15 len=20
	Fri Jul 25 17:56:51 2008: Wireless event: new AP:
00:00:00:00:00:00
	Fri Jul 25 17:56:51 2008: WPA: Dissasoc
	Fri Jul 25 17:56:51 2008: CTRL-EVENT-DISCONNECTED - Disconnect
event - remove keys
	Fri Jul 25 17:56:51 2008: State: GROUP_HANDSHAKE -> DISCONNECTED
	Fri Jul 25 17:56:51 2008: State: DISCONNECTED -> SCANNING
	------ End Log ------

So, I can see the problem, but first is anyone aware if there is a
potential problem in this step?

I have looked at the code it can be seen that the new PTK is installed
straight after the 4/4 is sent, so if 4/4 is lost am I correct that in
the above situation 3/4 retries can't be received as the new keys have
been already been installed on the STA?

I noticed that the problem was exacerbated by heavy traffic on the link.
The 4/4 send is using the sendto() call in function l2_packet_send() in
module l2_packet_linux.c. As far as I know this socket call returns as
soon as the packet has been accepted by the network stack. Can it be
guaranteed that this packet is transmitted before the keys are installed
or is there a possibility for the keys to be installed before the
transmission takes place causing it to be incorrectly encrypted and not
received by the AP?

I found this problem easily reproducible using two laptops, the above
software versions and a frequent re-authentication time, say 30 seconds.
I have also experienced it using other wireless cards (Broadcom and
Prism54), so it is not exclusive to the Atheros hardware and madwifi
drivers although they were used in the testing above.

I am currently just looking for information, thoughts ideas from anyone
who has deeper knowledge in this area, knows about this handshake
process or has possibly also experienced the same problem. If this is a
known problem area have any possible fixes been suggested? If this is
not known a problem and I have picked up completely the wrong branch
here then please point me in the right direction. If it does turn out to
be a problem area I should have some resource to help.

Regards,

Damon

done

Teradyne Diagnostic Solutions Limited, Reg. No. 790061 Orion Business Park, Bird Hall Lane, Stockport, SK3 0XG, United Kingdom

Teradyne Diagnostic Solutions GmbH, Reg No. HRB 7844 Adalperostrasse 29, 85737 Ismaning, Germany

Teradyne Diagnostic Solutions Inc., EIN 48-1281865 28970 Cabot Drive, Suite 100, Novi, MI 48377, USA

Teradyne Diagnostic Solutions Belgium
Delta Business Park, Satenrozen 1 B, 2550 Kontich, Belgium




More information about the Hostap mailing list