hostap-driver 0.4.7 -- scan bug ( managed mode )

Tony Espy espy
Thu Apr 6 16:06:33 PDT 2006


I think I've uncovered a bug with the way the hostap-driver returns scan
results.

I'm currently running v0.4.7 of the hostap-driver in managed mode on the 
Pepper Pad which has a CF-based WiFi card ( STA fw = 1.8.4 ).  The Pad 
also uses wpa_supplicant, but from my analysis, it doesn't appear to be 
involved.

There are two potential scenarios.  The first I discovered yesterday 
when I put my Pepper Pad to sleep  ( ie. suspend mode ) and drove into 
Cambridge, MA ( in a lovely Spring snowstorm I might add ) to meet a 
friend for lunch.

When I woke my Pad, it tried to connect to one of the APs we have here
in the office!  I took a look at one of our logs and noticed that the
scan results from wpa_supplicant included all of the APs in our office!

So after getting back, I ran thru some tests and stared at the driver
code for a while, and I think I've found the problem.

The function prism2_translate_scan() in hostap_ioctl.c builds the scan
response returned to user-land.  It uses two sources to build the response:

- 'last_scan_results' - an array populated with the scan results from
the prism firmware response to a SCAN or HOSTSCAN command

- 'bss_list' - this list is built dynamically by the driver whenever it
receives a beacon or probe frame.

APs added to the bss_list are expired every so often based upon a
last_update attribute.  The problem is that the expire_bss() function is
only called by the function that parses incoming beacon or probe frames
( hostap_rx_sta_beacon() ).  If none are received, the entries in the
list don't seem to ever expire.  More on this later...
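To illustrate the expiry problem, here is a minimal user-space model, not the driver's actual code. All names ( model_bss_list, model_rx_beacon(), MODEL_BSS_TTL, the 60 second interval, etc ) are hypothetical stand-ins, and the list is reduced to a fixed array. It only shows that when expiry is driven solely by the receive path, stale entries survive as long as nothing is received.

```c
#include <stddef.h>

#define MODEL_MAX_BSS 8
#define MODEL_BSS_TTL 60UL  /* seconds; hypothetical expiry interval */

struct model_bss {
    int in_use;
    unsigned long last_update;  /* time the last beacon/probe was seen */
};

struct model_bss_list {
    struct model_bss entry[MODEL_MAX_BSS];
};

/* Drop entries not refreshed within MODEL_BSS_TTL; return how many remain. */
static int model_expire_bss(struct model_bss_list *l, unsigned long now)
{
    int remaining = 0;

    for (size_t i = 0; i < MODEL_MAX_BSS; i++) {
        struct model_bss *b = &l->entry[i];

        if (b->in_use && now - b->last_update > MODEL_BSS_TTL)
            b->in_use = 0;
        if (b->in_use)
            remaining++;
    }
    return remaining;
}

/* Receive path: in this model, the only caller of model_expire_bss(). */
static void model_rx_beacon(struct model_bss_list *l, size_t slot,
                            unsigned long now)
{
    model_expire_bss(l, now);
    l->entry[slot].in_use = 1;
    l->entry[slot].last_update = now;
}

/* Count the entries a scan would report; note that no expiry runs here. */
static int model_count(const struct model_bss_list *l)
{
    int n = 0;

    for (size_t i = 0; i < MODEL_MAX_BSS; i++)
        n += l->entry[i].in_use;
    return n;
}
```

If two beacons are fed in and then nothing else arrives, model_count() keeps reporting both entries no matter how much time passes, because nothing ever triggers the expiry.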

The last piece of the puzzle is how the function prism2_translate_scan()
works.

First, it walks thru the bss_list and sets the include attribute of each
bss to 0.

Next, it walks thru the last_scan_results list.  For each entry, it
checks to see if the entry is also in the bss_list.  If it is, it adds
the bss entry to the response via a call to __prism2_translate_scan() 
and sets its include flag to 1.

Finally, it walks thru the bss_list and adds any entries that have not
already been added ( ie. include == 0 ).

It appears ( based upon comments ) that this last step was added to
work around the fact that the Prism firmware will only return a maximum
of 32 APs in its scan results.
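The three passes described above can be sketched roughly as follows. This is a simplified user-space model rather than the code in hostap_ioctl.c: the struct, the function name model_translate_scan() and the flat arrays are all hypothetical, and the response is reduced to a list of BSSID pointers. I'm also assuming that a firmware scan result with no match in the bss_list is still reported, which isn't spelled out above.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_ETH_ALEN 6

struct model_bss {
    unsigned char bssid[MODEL_ETH_ALEN];
    int included;  /* set once the entry has been added to the response */
};

/* Build the response in out[]; returns the number of entries written. */
static size_t model_translate_scan(struct model_bss *bss_list, size_t n_bss,
                                   unsigned char scan[][MODEL_ETH_ALEN],
                                   size_t n_scan,
                                   const unsigned char **out, size_t out_cap)
{
    size_t n = 0;

    /* Pass 1: clear the include flag on every cached BSS. */
    for (size_t i = 0; i < n_bss; i++)
        bss_list[i].included = 0;

    /* Pass 2: report each firmware scan result and mark the matching
     * cached entry, if there is one, as already included. */
    for (size_t s = 0; s < n_scan && n < out_cap; s++) {
        for (size_t i = 0; i < n_bss; i++) {
            if (memcmp(bss_list[i].bssid, scan[s], MODEL_ETH_ALEN) == 0) {
                bss_list[i].included = 1;
                break;
            }
        }
        out[n++] = scan[s];
    }

    /* Pass 3: append cached entries the firmware did not report.
     * This is where stale entries leak into the response. */
    for (size_t i = 0; i < n_bss && n < out_cap; i++) {
        if (!bss_list[i].included)
            out[n++] = bss_list[i].bssid;
    }
    return n;
}
```

With an empty firmware scan ( n_scan == 0 ), pass 3 still returns every cached entry, which matches what I saw after waking the Pad out of range of the office.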

So, this bug can occur whenever a device running the hostap-driver in
managed mode has a populated bss_list and moves out of range of any
access points ( ie. no beacon or probe frames are received ).  In our
case, this can easily be accomplished by putting the Pepper Pad to sleep
and waking it in an area that has no Wi-Fi traffic.

I think the proper fix is twofold:

1. On a suspend, clear the bss_list ( ie. via a new function
hostap_clear_bss() ).

2. Call hostap_expire_bss() from prism2_translate_scan() - this takes 
care of the case where someone has moved outside of the range of any 
access point traffic before the entries in the list have expired.  One 
possible scenario is getting into an elevator that moves out of range of 
any Wi-Fi traffic.
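The shape of the two-part fix, again as a stand-alone model and not the patch itself: every name below is hypothetical, the 60 second interval is made up, and locking is left out entirely ( the real driver would need to protect the list ).

```c
#include <stddef.h>

#define MODEL_MAX_BSS 8
#define MODEL_BSS_TTL 60UL  /* seconds; hypothetical expiry interval */

struct model_bss {
    int in_use;
    unsigned long last_update;  /* time the last beacon/probe was seen */
};

struct model_bss_list {
    struct model_bss entry[MODEL_MAX_BSS];
};

/* Part 1: drop every cached entry; meant to be called on suspend. */
static void model_clear_bss(struct model_bss_list *l)
{
    for (size_t i = 0; i < MODEL_MAX_BSS; i++)
        l->entry[i].in_use = 0;
}

/* Drop entries that have not been refreshed within MODEL_BSS_TTL. */
static void model_expire_bss(struct model_bss_list *l, unsigned long now)
{
    for (size_t i = 0; i < MODEL_MAX_BSS; i++) {
        struct model_bss *b = &l->entry[i];

        if (b->in_use && now - b->last_update > MODEL_BSS_TTL)
            b->in_use = 0;
    }
}

/* Part 2: run expiry at scan time, before counting what gets reported. */
static int model_scan_count(struct model_bss_list *l, unsigned long now)
{
    int n = 0;

    model_expire_bss(l, now);
    for (size_t i = 0; i < MODEL_MAX_BSS; i++)
        n += l->entry[i].in_use;
    return n;
}

/* Suspend hook for the model: nothing cached survives a sleep. */
static void model_suspend(struct model_bss_list *l)
{
    model_clear_bss(l);
}
```

Expiring at scan time covers the elevator case, where no suspend happens, and clearing on suspend covers the case where the clock used for last_update may not be meaningful across a sleep.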

I'll follow up this post with the actual patch.  The one thing I'm a bit
unsure about is whether or not I'm using the correct spinlock calls in
prism2_suspend().

Also FYI, the patch includes a previous patch that we've been running
with for quite some time that grabs the channel number out of local_info
if needed.  Without it, if the AP targeted by a JOIN isn't found in
last_scan_results, the JOIN is sent to the firmware with channel=0.
This was causing the firmware to behave unpredictably ( eg. the rate
would sometimes drop to 1Mbps and never go back up, and scanning would
break association ).
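The idea behind the channel fallback, as a small stand-alone sketch: the struct, the function name and the fallback_channel parameter are hypothetical, and where the fallback value actually comes from in the driver isn't modelled here.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_ETH_ALEN 6

struct model_scan_entry {
    unsigned char bssid[MODEL_ETH_ALEN];
    int channel;
};

/* Pick the channel for a JOIN: use the scan result when the BSSID is
 * known, otherwise fall back to a caller-supplied channel instead of 0. */
static int model_join_channel(const struct model_scan_entry *scan,
                              size_t n_scan,
                              const unsigned char *bssid,
                              int fallback_channel)
{
    for (size_t i = 0; i < n_scan; i++) {
        if (memcmp(scan[i].bssid, bssid, MODEL_ETH_ALEN) == 0)
            return scan[i].channel;
    }
    return fallback_channel;
}
```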

Comments?

Tony Espy
Pepper Computer





