Deadlock on (faked) firmware crash, CUS239, modified 10.4.3 firmware.

Michal Kazior michal.kazior at tieto.com
Wed Mar 30 23:32:52 PDT 2016


On 31 March 2016 at 00:28, Ben Greear <greearb at candelatech.com> wrote:
>
>> Hmm.. If it still reproduces can you try the following diff?
>>
>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>> @@ -3780,6 +3780,8 @@ void ath10k_mac_tx_push_pending(struct ath10k *ar)
>>                  list_del_init(&artxq->list);
>>                  if (ret != -ENOENT)
>>                          list_add_tail(&artxq->list, &ar->txqs);
>> +               else if (artxq == last)
>> +                       last = list_last_entry(&ar->txqs, struct ath10k_txq, list);
>>
>>                  ath10k_htt_tx_txq_update(hw, txq);
>
>
> Ok, I added this code, and can still reproduce the hang.
>
> Firmware is crashing multiple times a minute on this machine in its
> current configuration.  Right before it hung, firmware crashed and
> was restarted, and then I get the hang notification.
>
> I don't see any obvious bail-out in the tx_push_pending logic
> if the firmware crashes?

Right, there's no explicit bail-out. The loop should still bail out if
ath10k_mac_tx_push_txq() fails, though. The one exception is -ENOENT,
which is treated slightly differently but should also result in a
bail-out eventually, since ar->txqs will drain until it's empty.
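
To make the two exit paths concrete, here is a minimal userspace sketch.
It is not the driver code: struct sim_txq, the sim_* helpers, the
per-visit budget of 16 and the forced-error field are all hypothetical
stand-ins, and the round-robin stop marker ("last") is left out on
purpose. It only models the behaviour described above: -ENOENT removes a
queue from the list so the list eventually drains, and any other
negative return ends the loop at once.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for one per-station/TID tx queue (not struct ath10k_txq). */
struct sim_txq {
	struct sim_txq *prev, *next;	/* circular list links */
	int backlog;			/* frames still waiting in this queue */
	int fail_with;			/* 0, or a negative errno every push returns */
};

/* Make a node point at itself: an empty list head or an unlinked queue. */
static void sim_node_init(struct sim_txq *n, int backlog, int fail_with)
{
	n->prev = n->next = n;
	n->backlog = backlog;
	n->fail_with = fail_with;
}

static int sim_list_empty(const struct sim_txq *head)
{
	return head->next == head;
}

/* Unlink a queue and leave it pointing at itself. */
static void sim_list_del_init(struct sim_txq *q)
{
	q->prev->next = q->next;
	q->next->prev = q->prev;
	q->prev = q->next = q;
}

static void sim_list_add_tail(struct sim_txq *q, struct sim_txq *head)
{
	assert(q->next == q);		/* q must currently be unlinked */
	q->prev = head->prev;
	q->next = head;
	head->prev->next = q;
	head->prev = q;
}

/* Stand-in for the push step: 0 on success, -ENOENT when the queue is
 * empty, or the forced error if one is set. */
static int sim_push_one(struct sim_txq *q)
{
	if (q->fail_with)
		return q->fail_with;
	if (q->backlog == 0)
		return -ENOENT;
	q->backlog--;
	return 0;
}

/* Returns 0 once the list has drained, or the error that caused the bail-out. */
static int sim_push_pending(struct sim_txq *head)
{
	while (!sim_list_empty(head)) {
		struct sim_txq *q = head->next;
		int budget = 16;	/* assumed per-visit cap */
		int ret = 0;

		while (budget--) {
			ret = sim_push_one(q);
			if (ret < 0)
				break;
		}

		sim_list_del_init(q);
		if (ret != -ENOENT)
			sim_list_add_tail(q, head);	/* more work left, or an error */

		if (ret < 0 && ret != -ENOENT)
			return ret;			/* bail out on a real failure */
	}
	return 0;
}
```

With three queues and no forced error, sim_push_pending() returns 0 and
leaves the list empty. If the second queue is set up to fail with
-EBUSY, the call returns -EBUSY as soon as that queue is visited and the
third queue's backlog is never touched.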

HTT-tx doesn't check for a FW crash, but it should ultimately be limited
by either the CE ring size or HTT's num-pending-tx (neither should be
replenished once FW has crashed, as interrupts should not come in anymore).
Either way, a <0 retval should result in a bail-out.
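
As a rough illustration of the pending-tx bound only (the CE ring is not
modelled), here is a small userspace sketch. All names in it
(struct sim_htt, sim_htt_*) are hypothetical, and -EBUSY is an assumed
error code for "cap reached", not something taken from the driver. The
point it shows: once completions stop arriving, the in-flight counter
can only grow, so pushing must start failing after at most
max_num_pending_tx frames.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a tx path with a cap on in-flight frames. */
struct sim_htt {
	int num_pending_tx;	/* frames handed to firmware, not yet completed */
	int max_num_pending_tx;	/* cap on in-flight frames */
	bool fw_alive;		/* completions only arrive while this is true */
};

/* Reserve one in-flight slot; -EBUSY is an assumed code for "cap reached". */
static int sim_htt_tx_reserve(struct sim_htt *htt)
{
	assert(htt->max_num_pending_tx > 0);
	if (htt->num_pending_tx >= htt->max_num_pending_tx)
		return -EBUSY;
	htt->num_pending_tx++;
	return 0;
}

/* Model a tx-completion interrupt; a crashed firmware never delivers one. */
static void sim_htt_tx_complete(struct sim_htt *htt)
{
	if (htt->fw_alive && htt->num_pending_tx > 0)
		htt->num_pending_tx--;
}

/* Try to push up to 'attempts' frames and stop at the first error.
 * Returns the number of frames accepted; the last status goes to *err. */
static int sim_htt_push_many(struct sim_htt *htt, int attempts, int *err)
{
	int accepted = 0;

	*err = 0;
	while (attempts--) {
		*err = sim_htt_tx_reserve(htt);
		if (*err < 0)
			break;
		accepted++;
		sim_htt_tx_complete(htt);	/* no-op once firmware is dead */
	}
	return accepted;
}
```

With fw_alive set to false and a cap of 4, sim_htt_push_many() accepts
exactly 4 frames and then reports -EBUSY, however many attempts are
requested. With fw_alive set to true, each completion frees the slot
again and every attempt succeeds.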


Michał


