Anyone seeing tx-credits 'hang'?

Michal Kazior michal.kazior at tieto.com
Wed Jan 14 23:48:46 PST 2015


On 14 January 2015 at 18:57, Ben Greear <greearb at candelatech.com> wrote:
> On 01/14/2015 01:45 AM, Michal Kazior wrote:
>> On 13 January 2015 at 20:07, Ben Greear <greearb at candelatech.com> wrote:
>> [...]
>>>
>>> I managed to get some better debug out of the firmware.
>>>
>>> I am having a hell of a time figuring out how the code flows through all
>>> of the callbacks (in both firmware and driver), but it appears this is what happened:
>>>
>>> (I have instrumented transfer-id in both firmware and driver)
>>>
>>> firmware sent wmi message with transfer-id of 72.
>>> kernel received this transfer-id
>>> firmware's last send-callback transfer ID is 71.
>>>
>>> So, it seems that either ath10k did not do the transfer-complete logic,
>>> did it incorrectly, or the firmware did not notice it was done.
>>>
>>> I cannot find where the transfer complete code that should be updating
>>> firmware is at.  If you know, can you point me to it?
>>
>> I think the send-callback should be called when CE is simply done
>> doing its stuff. There's no need for the other side to ack anything
>> explicitly (it just needs to have a free buffer on its side so CE can
>> copy it over).
>>
>> Or maybe it is the HOST_IS_COPY_COMPLETE_MASK? Not really sure.
>
> I am now guessing that some magic IRQ happens when ath10k_ce_src_ring_write_index_set()
> is called.

Correct. CE should generate an interrupt (provided it's not masked in
CE registers) on the other end when ring index is bumped.


> I may have narrowed down the problem a bit further now.
>
> I printed out the ring indexes in firmware and driver when lockup
> occurred.  The target -> host ring ids match fine, but I notice that
> it appears the firmware has pending entries in its host -> target wmi
> ring that it has not consumed.
>
> Maybe it missed an irq or has some related race.

Hmm.. The host can tell the target it wants a tx credit update in the
htc host->target buffer. Upstream ath10k does this only when spending
the last tx credit. Your observation would explain why the firmware
doesn't send a tx credit update to the host - it never got to see the
need-credit-update flag. Does your tree change when
ATH10K_HTC_FLAG_NEED_CREDIT_UPDATE is set in ath10k?
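
To make the suspected stall concrete, here is a toy model in C. It is
only a sketch under my own assumptions, not driver or firmware code:
every sketch_* name, the flag value and the ring size are hypothetical,
and the real firmware may return credits under other conditions too. It
models a host that sets a need-credit-update flag only on the message
that spends its last credit, and a target that returns credits only
when it consumes a flagged message.

```c
#include <stdbool.h>

/* Hypothetical flag for this sketch. It plays the role of
 * ATH10K_HTC_FLAG_NEED_CREDIT_UPDATE but is not taken from the driver. */
#define SKETCH_FLAG_NEED_CREDIT_UPDATE 0x01
#define SKETCH_RING_SIZE 8

struct sketch_msg {
	unsigned int flags;
};

struct sketch_link {
	int host_credits;		/* credits the host may still spend */
	int target_held;		/* credits the target has not returned yet */
	struct sketch_msg ring[SKETCH_RING_SIZE];
	unsigned int write_idx;		/* bumped by the host */
	unsigned int read_idx;		/* bumped by the target */
};

void sketch_init(struct sketch_link *l, int credits)
{
	/* Clamp so outstanding messages can never overflow the ring. */
	if (credits > SKETCH_RING_SIZE)
		credits = SKETCH_RING_SIZE;
	l->host_credits = credits;
	l->target_held = 0;
	l->write_idx = 0;
	l->read_idx = 0;
}

/* Host side: spend one credit per message. The flag is set only on the
 * message that spends the last credit. Returns false when out of credits. */
bool sketch_host_send(struct sketch_link *l)
{
	struct sketch_msg m = { 0 };

	if (l->host_credits == 0)
		return false;

	l->host_credits--;
	if (l->host_credits == 0)
		m.flags |= SKETCH_FLAG_NEED_CREDIT_UPDATE;

	l->ring[l->write_idx % SKETCH_RING_SIZE] = m;
	l->write_idx++;
	return true;
}

/* Number of host -> target entries the target has not consumed. */
unsigned int sketch_pending(const struct sketch_link *l)
{
	return l->write_idx - l->read_idx;
}

/* Target side: consume at most one entry. Credits go back to the host
 * only when the consumed message carries the flag. Returns 1 if an
 * entry was consumed, 0 if the ring was empty. */
int sketch_target_poll(struct sketch_link *l)
{
	struct sketch_msg m;

	if (l->read_idx == l->write_idx)
		return 0;

	m = l->ring[l->read_idx % SKETCH_RING_SIZE];
	l->read_idx++;
	l->target_held++;

	if (m.flags & SKETCH_FLAG_NEED_CREDIT_UPDATE) {
		l->host_credits += l->target_held;
		l->target_held = 0;
	}
	return 1;
}
```

In this model, if the target stops polling while the flagged message is
still pending, sketch_host_send() keeps returning false and the host
never gets credits back, which looks like the hang you describe. One
more sketch_target_poll() that reaches the flagged entry unblocks it.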


> I'm going to try forcing a poll of the host -> target wmi queue in the
> firmware when it detects no wmi keep-alive messages and see if that kicks
> things back into action, and maybe see if I can find any reason for it
> to not properly handle the ring in the first place.

Did you try the old workaround ath10k had for hw1.0?


> If this works, perhaps there is a way to kick the ring from the driver
> side...maybe send a wmi command (ignoring quota) that has no effect,
> or something like that?

I think the wmi-echo could be suitable for this. It probably doesn't use
any extra resources so overcommitting tx-credit to send it should be
safe.


Michał


