Anyone seeing tx-credits 'hang'?

Peter Oh poh at
Wed Jan 14 17:54:01 PST 2015

On 01/14/2015 01:22 PM, Ben Greear wrote:
> On 01/14/2015 12:50 PM, Peter Oh wrote:
>> On 01/14/2015 09:57 AM, Ben Greear wrote:
>>> On 01/14/2015 01:45 AM, Michal Kazior wrote:
>>>> On 13 January 2015 at 20:07, Ben Greear <greearb at> wrote:
>>>> [...]
>>>>> I managed to get some better debug out of the firmware.
>>>>> I am having a hell of a time figuring out how the code flows through all
>>>>> of the callbacks (in both firmware and driver), but it appears this is what happened:
>>>>> (I have instrumented transfer-id in both firmware and driver)
>>>>> firmware sent wmi message with transfer-id of 72.
>>>>> kernel received this transfer-id
>>>>> firmware's last send-callback transfer ID is 71.
>>>>> So, it seems that either ath10k did not do the transfer-complete logic,
>>>>> did it incorrectly, or the firmware did not notice it was done.
>>>>> I cannot find where the transfer complete code that should be updating
>>>>> firmware is at.  If you know, can you point me to it?
>>>> I think the send-callback should be called when CE is simply done
>>>> doing its stuff. There's no need for the other side to ack anything
>>>> explicitly (it just needs to have a free buffer on its side so CE can
>>>> copy it over).
>>>> Or maybe it is the HOST_IS_COPY_COMPLETE_MASK? Not really sure.
>>> I am now guessing that some magic IRQ happens when ath10k_ce_src_ring_write_index_set()
>>> is called.
>> You may have already noticed, but to clarify: the magic IRQ is a DMA interrupt. The Copy Engine is much like a DMA engine with channels, and it raises an
>> interrupt automatically when a DMA transfer completes. We have registers to enable it: HOST_IE (offset 0x2c) and TARGET_IE (offset 0x24).
>> By ASIC design, ath10k_ce_src_ring_write_index_set (SRC_RING_WR_IND register, offset 0x3c) automatically triggers fetching of the data using DMA.
> Yes, that makes sense, and I appreciate the extra details.
>>> I may have narrowed down the problem a bit further now.
>>> I printed out the ring indexes in firmware and driver when lockup
>>> occurred.  The target -> host ring ids match fine, but I notice that
>>> it appears the firmware has pending entries in its host -> target wmi
>>> ring that it has not consumed.
>>> Maybe it missed an irq or has some related race.
>> Since the IRQ is a DMA interrupt triggered by the ASIC, the full amount of data must be transferred before the interrupt fires. If the IRQ does not happen even after
>> all the data has been transferred, then we may call it an ASIC bug; otherwise it could be a software issue. The corresponding status registers are TARGET_IS (offset 0x28)
>> and HOST_IS (offset 0x30), but I'm not sure which registers report the number of bytes that have been transferred. If we have this type of register, it will be
>> easy to determine whether the DMA is done.
> I found some things that look risky in the firmware CE code, but my attempts at
> fixing them made no improvement, so I am not sure I found any real problems in
> this area yet.  I'll be happy to send you the firmware patches for my debugging
> efforts and such if you are interested.
Sure. I'd like to run your changes, but I cannot guarantee how much
effort I can put into it, or by when.
> As for when bytes are fully read, see this firmware method:
> CE_completed_recv_next
> At this point, I am trying to make a work-around that will force a re-read of the ring
> buffer (basically, fake an interrupt).
> Back to the original attempt at debugging this...the problem was quite easy to reproduce
> before I started adding debugging to the firmware..and the debugging I have added is quite light on
> run-time behaviour, so I suspect some sort of race either in software or hardware.
> Hard to pin it down though.
> Out of curiosity, are you aware of anyone hitting this type of problem with upstream
> firmware?
Sorry, but I haven't seen anyone else report this issue.
> Thanks,
> Ben

More information about the ath10k mailing list