Reproducible issue in hacked 3.17 kernel, CT firmware

Ben Greear greearb at candelatech.com
Wed Jan 7 05:38:27 PST 2015



On 01/07/2015 01:58 AM, Michal Kazior wrote:
> On 30 December 2014 at 20:18, Ben Greear <greearb at candelatech.com> wrote:
>> yeah, so maybe not reproducible upstream, but anyway...
>>
>> My test case is to re-associate 4 stations over and over again, with
>> a scan and a 5 second sleep between iterations.  After
>> a short time, something goes weird and OS is mostly hung, probably
>> because important locks are held while ath10k is timing out communication
>> to firmware.
>>
>> The last message I see from firmware is that it is deleting vdev 4.
>>
>> I do not see any indication that firmware is crashed, but something
>> is wrong, maybe mgt buffers are used up?
> [...]
>> [  342.962494] ath10k_pci 0000:04:00.0: failed to set erp slot for vdev 4: -11
>
> -11 = -EAGAIN = out of wmi-htc tx credits. I wonder what the dbg
> buffer is trying to say.
>
> Either host sent a corrupted message and clogged up firmware buffers,
> firmware is busy processing other commands (wmi mgmt tx, wmi bcn
> non-dma tx) or became confused/corrupted.
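
To make the "out of tx credits" reading of -11 concrete, here is a toy model of credit-based flow control. It is only a sketch: the class and method names are hypothetical and are not taken from ath10k or the firmware. It shows a sender returning -EAGAIN once the receiver stops handing credits back. On Linux, errno.EAGAIN is 11.

```python
import errno


class CreditedLink:
    """Toy credit-based sender (hypothetical; not ath10k code)."""

    def __init__(self, credits):
        # Number of messages the receiver has agreed to buffer.
        self.credits = credits

    def send(self, msg):
        # With no credits left the sender must back off instead of
        # queuing more work; this is modelled as -EAGAIN.
        if self.credits == 0:
            return -errno.EAGAIN
        self.credits -= 1
        return 0

    def credit_report(self, n):
        # The receiver returns credits as it drains its buffers.
        self.credits += n


link = CreditedLink(credits=2)
results = [link.send(f"cmd{i}") for i in range(3)]
print(results)  # the third send fails with -errno.EAGAIN
```

If the receiver never calls credit_report(), every later send() keeps failing, which is roughly the symptom in the log above.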

I finally got back to debugging this yesterday, and interestingly, when
I added dbglog calls in the firmware around the credit handling, the problem is 'fixed'.

Looks like it ran overnight, whereas before it would fail within a few minutes.

So, maybe a race around PCI memory flushing or something like that?

I'll slowly back out my debug today and see what I can see.

Thanks,
Ben

-- 
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


