problem with libertas driver, Marvell W8686, and PXA270 on 2.6.27-rc7

Dan Williams dcbw at redhat.com
Sun Oct 12 22:01:50 EDT 2008


On Mon, 2008-10-06 at 11:38 -0400, Jeff Sutherland wrote:
> On Thursday 02 October 2008, Jeff Sutherland wrote:
> > On Thursday 02 October 2008, Dan Williams wrote:
> > > On Thu, 2008-10-02 at 10:17 -0400, Jeff Sutherland wrote:
> > > > I'm seeking some insight into the structure of the command flow when
> > > > communicating with a Murata wifi module using the W8686 chip connected
> > > > via SDIO on a PXA270-based system.  On about 2 out of 3 boot ups, and
> > > > occasionally when configuring the network interface by hand after the
> > > > system is up and running, the kernel will oops from the BUG()
> > > > instruction at line 147 of drivers/net/wireless/libertas/if_sdio.c. 
> > > > When this happens, the value of priv->resp_len[resp_idx] is always
> > > > sitting at 142. It's almost as if the command processing in
> > > > drivers/net/wireless/libertas/main.c is missing a command somehow.
> > > > Curiously, this only seems to happen when the interface is
> > >
> > > Hmm, could well be.  The flow is supposed to be that the interface code
> > > puts responses into the unused slot and wakes up the main thread, and
> > > the main thread processes the response.  Could be that if a second
> > > command response comes back from the card too soon the main thread
> > > wouldn't be woken up to process it yet.
> > >
> > > Everything I've seen and read indicates that commands are serialized and
> > > thus we can only have one command outstanding at any one time.  Thus we
> > > can probably rule out fundamental problems with the implementation, but
> > > instead focus on bugs and/or quirks.
> > >
> > > Could you by any chance minimally parse the command response in
> > > if_sdio_handle_cmd() with something like:
> > >
> > > > diff --git a/drivers/net/wireless/libertas/if_sdio.c b/drivers/net/wireless/libertas/if_sdio.c
> > > > index b54e2ea..024a81e 100644
> > > > --- a/drivers/net/wireless/libertas/if_sdio.c
> > > > +++ b/drivers/net/wireless/libertas/if_sdio.c
> > > > @@ -144,6 +144,18 @@ static int if_sdio_handle_cmd(struct if_sdio_card *card,
> > > >         spin_lock_irqsave(&priv->driver_lock, flags);
> > >
> > >         i = (priv->resp_idx == 0) ? 1 : 0;
> > > +
> > > +if (priv->resp_len[i])
> > > +{
> > > +struct cmd_header *resp = (void *) buffer;
> > > +uint16_t respcmd = le16_to_cpu(resp->command);
> > > +uint16_t result = le16_to_cpu(resp->result);
> > > +uint16_t seqnum = le16_to_cpu(resp->seqnum);
> > > +
> > > > +lbs_deb_sdio("CMD_RESP: response 0x%04x, result 0x%04x, seq %d, size %d\n",
> > > > +            respcmd, result, seqnum, size);
> > > +}
> > > +
> > >         BUG_ON(priv->resp_len[i]);
> > >         priv->resp_len[i] = size;
> > >         memcpy(priv->resp_buf[i], buffer, size);
> > >
> > > that coupled with turning on LBS_DEB_CMD and LBS_DEB_SDIO would be quite
> > > interesting to see the results of and should help narrow down what's
> > > > going on.  I'm especially interested in the sequence number to see if
> > > the before-BUG_ON response is the same sequence # as the BUG_ON
> > > response.
> > >
> > > Dan
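
[Editor's sketch] The two-slot handoff described above can be modelled in
userspace as follows. This is a minimal sketch, not the driver code: struct
fake_priv, store_response() and consume_response() are hypothetical names, and
the assumption that storing a response also records which slot the main thread
should read next is not stated in the thread. It only illustrates how a second
response arriving before the main thread runs can leave one slot permanently
occupied, so that a later response finds resp_len[i] non-zero.

```c
#include <assert.h>
#include <string.h>

#define RESP_BUF_SIZE 256

/* Hypothetical stand-in for the fields the debug patch touches. */
struct fake_priv {
    unsigned int resp_idx;                    /* slot the main thread reads */
    unsigned int resp_len[2];                 /* 0 means the slot is free */
    unsigned char resp_buf[2][RESP_BUF_SIZE];
};

/*
 * Interface side: put a response into the slot the main thread is not
 * using. Returns 0 on success, -1 where the driver would hit BUG_ON().
 */
static int store_response(struct fake_priv *priv,
                          const unsigned char *buffer, unsigned int size)
{
    unsigned int i = (priv->resp_idx == 0) ? 1 : 0;

    assert(size > 0 && size <= RESP_BUF_SIZE);
    if (priv->resp_len[i])
        return -1;              /* earlier response was never consumed */
    priv->resp_len[i] = size;
    memcpy(priv->resp_buf[i], buffer, size);
    priv->resp_idx = i;         /* assumption: the wake-up records the slot */
    return 0;
}

/*
 * Main-thread side: consume whatever sits in the current slot and free it.
 * Returns the length consumed, or 0 if nothing was pending.
 */
static unsigned int consume_response(struct fake_priv *priv)
{
    unsigned int len = priv->resp_len[priv->resp_idx];

    priv->resp_len[priv->resp_idx] = 0;
    return len;
}
```

Under these assumptions, store/consume/store/consume always succeeds, while
store/store/consume/store fails on the third store because the first response
is still sitting in its slot.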
> >
> > Well I've had a bit of luck here.  The driver will not BUG() with
> > LBS_DEB_CMD and LBS_DEB_SDIO enabled at the same time, but WILL bug pretty
> > regularly if these are enabled singly in turn.  The transcript is getting
> > pretty long so please see attached log file.
> 
> Continuing to pursue this (Thanks Dan for that code snippet), I've made the 
> following observations.  With libertas_debug=0x404000 the system will boot 
> and the wifi module will associate and acquire an ip address using WEP 
> encryption just fine every single time. However, with libertas_debug=0x4000 
> most of the time it will not boot.  But what's really interesting here is the 
> command sequence to the radio module changes depending on what debug levels 
> are set.  I think that's the real problem, since it appears that a command 
> 0x0016 (CMD_802_11_SNMP_MIB) following a command 0x0050 
> (CMD_802_11_ASSOCIATE) must cockup the firmware in the radio module.  In the 
> normal startup a command 0x001f (CMD_802_11_RSSI) follows command 0x0050 and 
> everything appears to work fine.  Note how command 0x0016 will time out, but 
> that the driver (provided the kernel doesn't oops) will recover and continue 
> with module initialization.  How is it that command execution order gets 
> shuffled depending on the setting of debug levels?  That seems to be the real 
> problem here.  If the data in columns below gets munged by the mailers, see 
> attached file for the same thing in (hopefully) readable form.  Data is 
> [sequence number]:[abbreviated hex command code].
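
[Editor's sketch] For readers comparing the two libertas_debug values quoted
above, here is a small decoder. The bit values are inferred from this thread
(0x4000 is the single-flag case, 0x404000 is the case with LBS_DEB_CMD and
LBS_DEB_SDIO both on); DEB_CMD, DEB_SDIO and describe_debug_mask() are
illustrative names, so check them against the driver headers before relying on
them.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed bit values, inferred from the masks quoted in the thread. */
#define DEB_CMD  0x00004000u
#define DEB_SDIO 0x00400000u

/*
 * Write a short description of the mask into out and return how many of
 * the two flags discussed in the thread are set.
 */
static int describe_debug_mask(unsigned int mask, char *out, size_t outlen)
{
    int known = 0;

    assert(out != NULL && outlen > 0);
    if (mask & DEB_CMD)
        known++;
    if (mask & DEB_SDIO)
        known++;
    snprintf(out, outlen, "0x%08x:%s%s", mask,
             (mask & DEB_CMD) ? " CMD" : "",
             (mask & DEB_SDIO) ? " SDIO" : "");
    return known;
}
```

Calling describe_debug_mask(0x404000, buf, sizeof(buf)) reports both flags,
while 0x4000 reports only the command flag, which matches the two boot
configurations being compared.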

Great info, looking at it.  Were you running 'iwconfig' periodically by
any chance?  Just trying to figure out what might cause the command
sequences below.  SNMP_MIB is used for a number of things, only one of
which (set infra/adhoc mode) is in the normal association path.  The
rest (rts threshold, frag threshold, retry limits) could be triggered by
a plain 'iwconfig'.

Dan

> Command sequences on system startup:
>
> A: libertas_debug=0x00404000, good startup
> B: libertas_debug=0x4000, oops occurred
> C: libertas_debug=0x4000, timeouts but no oops
>
> A                     B                       C
> 7:10                  7:10                    7:10
> 8:28                  8:28                    8:28
> 9:1e                  9:1e                    9:1e
> 10:13                 10:16                   10:16
> 11:28                 11:16                   11:16
> 12:28                 12:13                   12:13
> 13:2F                 13:16                   13:16
> 14:16                 14:28                   14:28
> 15:06                 15:28                   15:28
> 16:16                 16:2f                   16:2f
> 17:06                 17:06                   17:06
> 18:16                 18:06                   18:06
> 19:06                 19:06                   19:06
> 20:11                 20:1e                   20:1e
> 21:1C                 21:11                   21:11
> 22:50 (resp: 8012)    22:16                   22:16
> 23:1F                 23:1C                   23:1C
> 24:0B                 24:16                   24:16
> 25:1F                 25:50 (resp:8012)       25:50 (resp:8012)
> 26:1E                 26:16 (cmd timed out)   26:16 (cmd timed out)
>
> After sequence 26:
>
> A: 27:16
>    28:16
>    29:16
>    30:10
>    Sending discover
>
> B: (makes 3 attempts at seq. 26, then error -110, cmd failed)
>    27:1F (resp:8016, len 142)
>    Received CMD_RESP with invalid sequence 26 (expected 27)
>    BUG()
>
> C: (makes 3 attempts at seq. 26, then PREP_CMD: cmd 0016 failed)
>    27:1F (resp 8016, seq 26, size 142)
>    Received CMD_RESP with invalid sequence 26 (expected 27)
>    then Received response 801f, seq 27
>    28:0b
>    29:1F
>    30:1e
>    31:16
>    32:16
>    33:16
>    34:10
>    Sending discover
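
[Editor's sketch] The "invalid sequence 26 (expected 27)" case in the data
above amounts to a stale response arriving after its command has already timed
out. A minimal sketch of that check, assuming an 8-byte little-endian header:
the debug patch earlier in the thread names the command, result and seqnum
fields, but the field order and the size field used here are assumptions, and
struct resp_header, parse_header() and seq_matches() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header layout; only the field names come from the thread. */
struct resp_header {
    uint16_t command;
    uint16_t size;
    uint16_t seqnum;
    uint16_t result;
};

/* Read a 16-bit little-endian value regardless of host byte order. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the first 8 bytes of a response buffer into host-order fields. */
static struct resp_header parse_header(const uint8_t *buf, size_t len)
{
    struct resp_header h;

    assert(buf != NULL && len >= 8);
    h.command = get_le16(buf);
    h.size    = get_le16(buf + 2);
    h.seqnum  = get_le16(buf + 4);
    h.result  = get_le16(buf + 6);
    return h;
}

/* Non-zero when the response belongs to the command currently awaited. */
static int seq_matches(const struct resp_header *h, uint16_t expected_seq)
{
    return h->seqnum == expected_seq;
}
```

For example, a buffer starting with the bytes 16 80 8e 00 1a 00 00 00 decodes
to response 0x8016, size 142, sequence 26 (the result value of 0 is a
placeholder, since the thread does not give it). Checking it against an
expected sequence of 27 fails, which is the situation reported in columns B
and C.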
> 
> _______________________________________________
> libertas-dev mailing list
> libertas-dev at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/libertas-dev



