hostap-driver 0.4.7 -- scan bug ( managed mode )
Tony Espy
espy
Thu Apr 6 16:06:33 PDT 2006
I think I've uncovered a bug in the way the hostap-driver returns scan
results.
I'm currently running v0.4.7 of the hostap-driver in managed mode on the
Pepper Pad which has a CF-based WiFi card ( STA fw = 1.8.4 ). The Pad
also uses wpa_supplicant, but from my analysis, it doesn't appear to be
involved.
There are two potential scenarios. The first I discovered yesterday
when I put my Pepper Pad to sleep ( ie. suspend mode ) and drove into
Cambridge, MA ( in a lovely Spring snowstorm I might add ) to meet a
friend for lunch.
When I woke my Pad, it tried to connect to one of the APs we have here
in the office! I took a look at one of our logs and noticed that the
scan results from wpa_supplicant included all of the APs in our office!
So after getting back, I ran thru some tests and stared at the driver
code for a while, and I think I've found the problem.
The function prism2_translate_scan() in hostap_ioctl.c builds the scan
response returned to user-land. It uses two sources to build the response:
- 'last_scan_results' - an array populated with the scan results from
the prism firmware response to a SCAN or HOSTSCAN command
- 'bss_list' - this list is built dynamically by the driver whenever it
receives a beacon or probe frame.
APs added to the bss_list are expired every so often based upon a
last_update attribute. The problem is that the expire_bss() function is
only called by the function that parses incoming beacon or probe frames
( hostap_rx_sta_beacon() ). If none are received, the entries in the
list don't seem to ever expire. More on this later...
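To make the expiry problem concrete, here is a small user-space model of it. This is only a sketch, not the driver code: the table layout, the 60 second timeout, and the names rx_beacon() and count_bss() are assumptions made up for illustration. The only thing taken from the driver is the shape of the logic, namely that expiry runs solely as a side effect of receiving a frame.

```c
#include <string.h>

#define MAX_BSS 8
#define BSS_EXPIRE_SECS 60 /* assumed timeout, for illustration only */

struct bss_entry {
    int in_use;
    unsigned char bssid[6];
    long last_update; /* time the AP was last heard, in seconds */
};

struct bss_table {
    struct bss_entry e[MAX_BSS];
};

/* Number of entries currently held in the table. */
static int count_bss(const struct bss_table *t)
{
    int n = 0;
    for (int i = 0; i < MAX_BSS; i++)
        n += t->e[i].in_use;
    return n;
}

/* Drop every entry that has not been refreshed within the timeout. */
static void expire_bss(struct bss_table *t, long now)
{
    for (int i = 0; i < MAX_BSS; i++)
        if (t->e[i].in_use && now - t->e[i].last_update > BSS_EXPIRE_SECS)
            t->e[i].in_use = 0;
}

/* Receive path: the only place expiry runs in this model. */
static void rx_beacon(struct bss_table *t, const unsigned char *bssid, long now)
{
    struct bss_entry *slot = NULL;

    expire_bss(t, now);
    for (int i = 0; i < MAX_BSS; i++) {
        if (t->e[i].in_use && memcmp(t->e[i].bssid, bssid, 6) == 0) {
            slot = &t->e[i]; /* known AP: refresh it */
            break;
        }
        if (!t->e[i].in_use && slot == NULL)
            slot = &t->e[i]; /* remember the first free slot */
    }
    if (slot == NULL)
        return; /* table full; this sketch simply drops the frame */

    memcpy(slot->bssid, bssid, 6);
    slot->in_use = 1;
    slot->last_update = now;
}
```

In this model, two APs heard at t=0 and t=10 are both still in the table an hour later if nothing calls rx_beacon() in between. They only disappear once a frame from some other AP arrives and triggers expire_bss().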
The last piece of the puzzle is how the function prism2_translate_scan()
works.
First, it walks thru the bss_list and sets the include attribute of each
bss to 0.
Next, it walks thru the last_scan_results list. For each entry, it
checks to see if the entry is also in the bss_list. If it is, it adds
the bss entry to the response via a call to __prism2_translate_scan()
and sets its include flag to 1.
Finally, it walks thru the bss_list and adds any entries that have not
already been added ( ie. include == 0 ).
It appears ( based upon comments ) that this last logic was done to
circumvent the fact that the Prism firmware will only return a maximum
of 32 APs in its scan results.
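The three passes described above can be modelled roughly as follows. This is a simplified user-space sketch, not the code from hostap_ioctl.c: APs are identified by a plain int instead of a BSSID, and the function signature is invented for illustration. It also assumes that a firmware scan entry with no match in the bss_list is still reported.

```c
#include <stddef.h>

struct bss {
    int id;      /* stands in for the BSSID */
    int include; /* set once the entry has been reported */
};

/* Build the scan response in out[]; returns the number of entries written. */
static size_t translate_scan(const int *scan, size_t nscan,
                             struct bss *bss, size_t nbss,
                             int *out, size_t cap)
{
    size_t n = 0;

    /* Pass 1: reset the include flag on every cached entry. */
    for (size_t i = 0; i < nbss; i++)
        bss[i].include = 0;

    /* Pass 2: report firmware scan results, marking matching cache entries. */
    for (size_t s = 0; s < nscan && n < cap; s++) {
        for (size_t i = 0; i < nbss; i++)
            if (bss[i].id == scan[s])
                bss[i].include = 1;
        out[n++] = scan[s]; /* assumption: reported even without a cache match */
    }

    /* Pass 3: append cached entries the firmware did not report. */
    for (size_t i = 0; i < nbss && n < cap; i++)
        if (!bss[i].include)
            out[n++] = bss[i].id;

    return n;
}
```

With a cache holding APs 1 and 2 and an empty firmware scan, this model still returns both cached APs, which is the stale-result behaviour seen on the Pad.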
So, this bug can occur whenever a device running the hostap-driver in
managed mode has a populated bss_list and moves out of range of any
access points ( ie. no beacon or probe frames are received ). In our
case, this can easily be accomplished by putting the Pepper Pad to sleep
and waking it in an area that has no Wi-Fi traffic.
I think the proper fix is twofold:
1. On a suspend, clear the bss_list ( ie. via a new function,
hostap_clear_bss() ).
2. Call hostap_expire_bss() from prism2_translate_scan() - this takes
care of the case where someone has moved outside of the range of any
access point traffic before the entries in the list have expired. One
possible scenario is getting into an elevator that moves out of range of
any Wi-Fi traffic.
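A rough user-space sketch of the two-part fix follows. It is not the patch itself: the table layout, the 60 second timeout, and the names scan_results() and on_suspend() are placeholders invented for illustration, and the locking that the real driver needs around the list is left out entirely.

```c
#include <stddef.h>

#define MAX_BSS 8
#define BSS_EXPIRE_SECS 60 /* assumed timeout, for illustration only */

struct bss_entry {
    int in_use;
    int id;           /* stands in for the BSSID */
    long last_update; /* time the AP was last heard, in seconds */
};

/* Part 1 of the fix: forget every cached AP. */
static void clear_bss(struct bss_entry *tbl)
{
    for (size_t i = 0; i < MAX_BSS; i++)
        tbl[i].in_use = 0;
}

/* Drop entries that have not been refreshed within the timeout. */
static void expire_bss(struct bss_entry *tbl, long now)
{
    for (size_t i = 0; i < MAX_BSS; i++)
        if (tbl[i].in_use && now - tbl[i].last_update > BSS_EXPIRE_SECS)
            tbl[i].in_use = 0;
}

/* Suspend hook: the cache is meaningless after waking somewhere else. */
static void on_suspend(struct bss_entry *tbl)
{
    clear_bss(tbl);
}

/* Part 2 of the fix: expire stale entries before building the response. */
static size_t scan_results(struct bss_entry *tbl, long now,
                           int *out, size_t cap)
{
    size_t n = 0;

    expire_bss(tbl, now);
    for (size_t i = 0; i < MAX_BSS && n < cap; i++)
        if (tbl[i].in_use)
            out[n++] = tbl[i].id;
    return n;
}
```

Expiring at scan time covers the elevator case, where the device stays awake but stops hearing frames. Clearing on suspend covers the case where the device wakes up before the timeout would have removed the old entries.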
I'll follow up this post with the actual patch. The one thing I'm a bit
unsure about is whether or not I'm using the correct spinlock calls in
prism2_suspend().
Also FYI, the patch includes a previous patch that we've been running
with for quite some time that grabs the channel number out of local_info
if needed. Without it, if the AP targeted by the JOIN isn't found in the
last_scan_results, the JOIN is sent to the firmware with channel=0.
This was causing the firmware to behave unpredictably ( eg. the rate
would sometimes drop to 1Mbps and never go back up and scanning would
break association ).
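The idea behind the channel fallback can be sketched as below. This is an illustration only: struct scan_result, join_channel(), and the configured_channel parameter are hypothetical names, with configured_channel standing in for whatever channel value is kept in local_info.

```c
#include <stddef.h>
#include <string.h>

struct scan_result {
    unsigned char bssid[6];
    int channel;
};

/* Pick the channel to put in a JOIN request for the given BSSID. */
static int join_channel(const struct scan_result *res, size_t n,
                        const unsigned char *bssid, int configured_channel)
{
    /* Prefer the channel the firmware reported for this AP. */
    for (size_t i = 0; i < n; i++)
        if (memcmp(res[i].bssid, bssid, 6) == 0)
            return res[i].channel;

    /* AP not in the scan results: fall back instead of sending channel=0. */
    return configured_channel;
}
```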
Comments?
Tony Espy
Pepper Computer