Deadlock on (faked) firmware crash, CUS239, modified 10.4.3 firmware.

Ben Greear greearb at candelatech.com
Thu Mar 31 22:33:40 PDT 2016



On 03/31/2016 10:26 PM, Michal Kazior wrote:
> On 31 March 2016 at 21:16, Ben Greear <greearb at candelatech.com> wrote:
> [...]
>> I tried adding check for FW crash yesterday, but that did not help.
>>
>> Today, I added a limit of 2000 loops.  I see that hit, and then kernel
>> crashes.  Maybe my patch is wrong.
>>
>> I've tried to apply (almost) every patch in linux.ath related to ath10k,
>> including a few from the mailing list that have not been applied yet.
>>
>> My push-pending method now looks like this:
>>
>> void ath10k_mac_tx_push_pending(struct ath10k *ar)
>> {
> [...]
>> }
>
> Looks sane.
>
>
>> The crash I get is this:
>>
>>
>> ath10k_pci 0000:05:00.0: firmware crashed! (uuid
>> 2a118708-977d-43d6-8d40-079ddec99eb3)
> [...]
>> BUG: unable to handle kernel paging request at 0000000000001000
>> IP: [<ffffffffa08e9810>] __skb_dequeue+0x2e/0x37 [mac80211]
>
> Hmm.. Do you have 2a58d42c1e01 ("mac80211: fix txq queue related
> crashes") applied?

Yes, though it has a different hash in my tree, probably because of merge issues.

See the patches I posted today to fix stale access to peer objects.
They seem to have fixed these problems for me, or at least made
them much harder to hit.

At quitting time, I was still seeing KASAN errors in the mac80211
stats logic, so there are more bugs waiting for tomorrow.

Thanks,
Ben


-- 
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


