Help with finding a possible bug in wpa_supplicant

xiepsilon at openmailbox.org xiepsilon
Thu Jun 26 06:15:25 PDT 2014


I recently came across unexpected behaviour in the wpa_supplicant 
program while experimenting on my Linux box.
My system runs Arch Linux, and I'm fairly certain that the package is 
built from upstream sources.

I was running out of an initramfs, having passed break=postmount at the 
GRUB prompt to get a shell.
I then proceeded to bind mount /dev, /sys and /proc under the mountpoint 
for my root disk, and chrooted in.
Having loaded the wireless module and the relevant crypt modules, I then 
attempted to run wpa_supplicant to connect to my AP:
$ wpa_supplicant -B -i wlan0 -c /etc/wpa.conf
(wpa.conf is a simple config I created, the output of wpa_passphrase.)
All wpa_supplicant did was print the usual "successfully initialised" 
message. But an attempt to obtain an IP address on wlan0 timed out on 
"waiting for carrier".

During normal operation, the wireless adapter (a USB Ralink device that 
uses the rt2800usb module) lights an LED when a connection is 
established. But in this case, it flickered and went out.
So I decided to omit the -B flag and run wpa_supplicant in the 
foreground to diagnose the problem. After a few lines, the message 
"wlan0: WPA: failed to set PTK to the driver" popped up.

This seemed a little strange, as I was sure I had set everything up 
correctly for wpa_supplicant to operate.

I rebooted and this time passed init=/bin/sh. To the best of my 
knowledge (after reading the initramfs script and the source for 
switch_root) this should produce an environment identical to the one I 
had set up before.
Only this time, wpa_supplicant was successful in connecting to the AP. I 
could run dhcpcd and access the outside 'net.

I had taken strace logs in both scenarios, in an attempt to isolate the 
problem. They appear roughly similar up until a particular recvmsg() 
call:

recvmsg(4, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"l\0\0\0\2\0\0\0\35\215\252S\n\1\0\0\376\377\377\377X\0\0\0\32\0\5\0\35\215\252S"..., 
16384}], msg_controllen=0, msg_flags=0}, 0) = 108
This is taken from the failing scenario, and

recvmsg(4, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"$\0\0\0\2\0\0\0\32\217\252S\263\0\0\0\0\0\0\0d\0\0\0\32\0\5\0\32\217\252S"..., 
16384}], msg_controllen=0, msg_flags=0}, 0) = 36
is from the successful scenario.

While I expected the data in the curly braces to be subtly different in 
each case, I noted that the returned size value (recvmsg returns an 
ssize_t giving the number of bytes received) was very different: 108 
bytes in the failing case against 36 in the successful one.
Also, in the failed version's strace, the message about PTK failure is 
printed immediately after that recvmsg(), whereas the message printed 
after the recvmsg() in the successful version was "wlan0: WPA: Key 
negotiation completed".
This leads me to conclude that this particular syscall is the point of 
failure.
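
The size difference can be accounted for by decoding the first bytes of 
each buffer as a netlink header. The following is only a sketch: it 
assumes a little-endian machine and the standard struct nlmsghdr / 
struct nlmsgerr layout from <linux/netlink.h>, and it sees only the 32 
bytes that strace printed before truncating. The names FAILED, WORKED 
and decode() are made up for this example.

```python
import errno
import struct

# First 32 bytes of each reply, copied from the strace output above.
FAILED = b"l\0\0\0\2\0\0\0\35\215\252S\n\1\0\0\376\377\377\377X\0\0\0\32\0\5\0\35\215\252S"
WORKED = b"$\0\0\0\2\0\0\0\32\217\252S\263\0\0\0\0\0\0\0d\0\0\0\32\0\5\0\32\217\252S"

NLMSG_ERROR = 2  # message type used for both errors and ACKs


def decode(buf):
    """Decode struct nlmsghdr and, for NLMSG_ERROR, the start of struct nlmsgerr."""
    # struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
    length, msg_type, flags, seq, pid = struct.unpack_from("<IHHII", buf, 0)
    info = {"len": length, "type": msg_type, "seq": seq, "pid": pid}
    if msg_type == NLMSG_ERROR:
        # struct nlmsgerr: s32 error, followed by the header of the request
        (error,) = struct.unpack_from("<i", buf, 16)
        (request_len,) = struct.unpack_from("<I", buf, 20)
        info["error"] = error
        info["errname"] = errno.errorcode.get(-error, "?") if error else "ACK"
        info["request_len"] = request_len
    return info


if __name__ == "__main__":
    for name, buf in (("failed", FAILED), ("worked", WORKED)):
        print(name, decode(buf))
```

Read this way, both replies are NLMSG_ERROR messages. The successful 
one carries error 0, which is a plain ACK, while the failing one carries 
error -2 (ENOENT). By default the kernel echoes the whole request back 
on an error but only its header on an ACK, which matches the sizes: 
108 = 16 + 4 + 88 and 36 = 16 + 4 + 16. This does not by itself show 
which side is at fault, only that the kernel rejected the request that 
preceded the PTK failure message.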

As yet, I have been unable to determine whether this is a kernel bug or 
a wpa_supplicant bug. I got in way over my head trying to read the 
wpa_supplicant sources, so I would be grateful if you could help isolate 
the problem.

Thanks in advance,
Xi
