hacked 4.4.6+, 10.4.3 firmware, Running out of ring-index for pipe-id 3 (WMI).
Ben Greear
greearb at candelatech.com
Thu Mar 31 08:51:42 PDT 2016
On 03/30/2016 11:51 PM, Michal Kazior wrote:
> On 29 March 2016 at 17:48, Ben Greear <greearb at candelatech.com> wrote:
>> On 03/29/2016 01:05 AM, Michal Kazior wrote:
>>>
>>> On 28 March 2016 at 21:01, Ben Greear <greearb at candelatech.com> wrote:
>>>>
>>>> I'm seeing the ring-full messages below when running 35 stations on
>>>> modified 10.4.3 firmware. I also have serial console logging enabled, so
>>>> things are running a bit slow...this seems to exacerbate the issue.
>>>>
>>>> [ 91.108923] ath10k_pci 0000:05:00.0: htc failed hif-tx-sq: -105 eid: 2
>>>> credits: 1 ep->tx_credits: 1 credit-flow-enabled: 1
>>>> [ 91.108932] ath10k_pci 0000:05:00.0: could not request stats (type 128
>>>> ret -105)
>>>> [ 91.108942] ath10k_pci 0000:05:00.0: hif-tx-sg, full, nentries_mask:
>>>> 0x1f
>>>> write_idx: 2 sw-idx: 3 n_items: 1 pipe-id: 3
>>>> [ 91.108944] ath10k_pci 0000:05:00.0: htc failed hif-tx-sq: -105 eid: 2
>>>> credits: 1 ep->tx_credits: 1 credit-flow-enabled: 1
>>>> [ 91.108952] ath10k_pci 0000:05:00.0: could not request stats (type 1
>>>> ret
>>>> -105)
>>>> [ 91.108953] ath10k_pci 0000:05:00.0: failed to get fw stats for
>>>> ethtool:
>>>> -105
>>>> [ 91.109039] ath10k_pci 0000:05:00.0: hif-tx-sg, full, nentries_mask:
>>>> 0x1f
>>>> write_idx: 2 sw-idx: 3 n_items: 1 pipe-id: 3
>>>> [ 91.109041] ath10k_pci 0000:05:00.0: htc failed hif-tx-sq: -105 eid: 2
>>>> credits: 1 ep->tx_credits: 1 credit-flow-enabled: 1
>>>> [ 91.109050] ath10k_pci 0000:05:00.0: could not request stats (type 128
>>>> ret -105)
>>>> [ 91.109060] ath10k_pci 0000:05:00.0: hif-tx-sg, full, nentries_mask:
>>>> 0x1f
>>>> write_idx: 2 sw-idx: 3 n_items: 1 pipe-id: 3
>>>> [ 91.109062] ath10k_pci 0000:05:00.0: htc failed hif-tx-sq: -105 eid: 2
>>>> credits: 1 ep->tx_credits: 1 credit-flow-enabled: 1
>>>> [ 91.109070] ath10k_pci 0000:05:00.0: could not request stats (type 1
>>>> ret
>>>> -105)
>>>> [ 91.109072] ath10k_pci 0000:05:00.0: failed to get fw stats for
>>>> ethtool:
>>>> -105
>>>> [ 91.109157] ath10k_pci 0000:05:00.0: hif-tx-sg, full, nentries_mask:
>>>> 0x1f
>>>> write_idx: 2 sw-idx: 3 n_items: 1 pipe-id: 3
>>>> [ 91.109160] ath10k_pci 0000:05:00.0: htc failed hif-tx-sq: -105 eid: 2
>>>> credits: 1 ep->tx_credits: 1 credit-flow-enabled: 1
>>>>
>>>>
>>>> I am struggling to understand how the pipe can be full since we have
>>>> tx-credits logic
>>>> enabled for the WMI pipe.
>>>>
>>>> Any suggestions on what sort of bugs could cause this?
>>>>
>>>> And, should the ath10k_wmi_cmd_send retry when we get a -105 return
>>>> code in hopes it will free up shortly instead of just failing and leaving
>>>> the system in invalid state?
>>>
>>>
>>> It probably shouldn't. As you've pointed out HTC tx credits should
>>> prevent this in the first place. If you see -105 it means something is
>>> really broken and needs to be fixed properly.
>>>
>>> A thing that comes to mind is that CE -for whatever reason- would need
>>> to stop completing CE ring items. Are you running with MSI? 1 or
>>> multiple interrupts? Did you try forcing legacy interrupt mode to rule
>>> out MSI problems?
>>>
>>> You could add debug messages to see if the HTC-WMI CE ring gets tx
>>> completions properly.
>>
>>
>> I don't think I'm using MSI. Could it be that whatever logic that should
>> be processing the tx-completions is just running slower than whatever is
>> handling the WMI messages (and credits)?
>
> Your WMI command queue is limited to HTC Tx credits (2, right?). This
> means you can enqueue, in practice, 2 CE items to WMI's CE Tx pipe.
> Once you've done that you have to wait until next interrupt carrying
> HTC Rx message with Tx Credit Update. If you get this it implies FW
> received your WMI commands which implies WMI's CE Tx pipe was updated
> (and at least the 2 CE's associated with your WMI commands have been
> consumed/completed). Even if you assume CE processing ordering is
> reversed (i.e. HTC Rx gets processed before HTC Tx completions are)
> you still should be able to have enqueued no more than 4 CE items at a
> time as far as WMI is concerned.
>
> Now, if you assume MSI-range (multiple MSI interrupts; a vector) is
> enabled, you can service each CE pipe in a separate interrupt and
> tasklet. This could, in theory, result in some weird race as HTC Tx
> credits and CE Tx pipe completions are not guaranteed to be
> serialized.
That could be what happened.
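
To make the bound described above concrete, here is a rough sketch of the accounting, not driver code. It assumes 2 HTC tx credits and a 32-entry CE ring with 31 usable slots (from nentries_mask 0x1f). The names wmi_pipe_model and model_run are hypothetical and exist only for this illustration. Each round returns the credits spent in the previous round, sends until credits run out, and then, if completions are being processed at all, completes the previous round's entries. That models the "HTC Rx processed before HTC Tx completions" ordering.

```c
#include <stddef.h>

/* Hypothetical model of the WMI pipe accounting; not ath10k code. */
struct wmi_pipe_model {
	int credits;        /* HTC tx credits currently held by the host */
	int ring_capacity;  /* usable CE ring slots (assumed 31 for mask 0x1f) */
	int outstanding;    /* CE entries posted but not yet completed */
	int peak;           /* highest value 'outstanding' ever reached */
};

/*
 * Run up to 'rounds' rounds. Returns the 1-based round in which a send
 * found the ring full, or 0 if that never happened.
 */
static int model_run(struct wmi_pipe_model *m, int rounds, int completions_work)
{
	int prev_sent = 0;

	for (int round = 1; round <= rounds; round++) {
		int sent = 0;

		/* Credit update for the commands sent in the previous round. */
		m->credits += prev_sent;

		/* Send while credits allow; this is the only send-side gate. */
		while (m->credits > 0) {
			if (m->outstanding >= m->ring_capacity)
				return round; /* ring full despite holding a credit */
			m->credits--;
			m->outstanding++;
			sent++;
			if (m->outstanding > m->peak)
				m->peak = m->outstanding;
		}

		/* Tx completions lag one round behind the credit update. */
		if (completions_work)
			m->outstanding -= prev_sent;

		prev_sent = sent;
	}
	return 0;
}
```

Under these assumptions, lagging completions never push the ring past 4 outstanding entries, while completions that stop entirely fill the ring after a handful of credit round trips even though credits are still honoured.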
>
> Or maybe you're using some forced WMI commands in your fork and
> disregard Tx credits in some cases? This could explain the problem
> even when running with a single interrupt.
I'm not forcing WMI commands. And if I really were managing to queue up 30
WMI cmds in the firmware, I would expect the firmware to have crashed, since
it has very few local resources for them (i.e., only 2).
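
For reference, the indices in the hif-tx-sg log line can be checked by hand. The sketch below assumes the CE_RING_DELTA definition from ath10k's ce.h (reproduced here as I understand it, so treat it as an assumption) and that the free-slot check compares against sw_idx - 1, which leaves one slot of the ring permanently unused. ring_free_slots and ring_pending are hypothetical helper names. -105 is -ENOBUFS on Linux.

```c
#include <stddef.h>

/* Assumed to mirror CE_RING_DELTA in ath10k's ce.h. */
#define CE_RING_DELTA(nentries_mask, fromidx, toidx) \
	(((int)(toidx) - (int)(fromidx)) & (nentries_mask))

/* Hypothetical helper: send slots still free in the source ring. */
static int ring_free_slots(unsigned int nentries_mask,
			   unsigned int write_idx, unsigned int sw_idx)
{
	/* One slot is kept empty, hence sw_idx - 1. */
	return CE_RING_DELTA(nentries_mask, write_idx, sw_idx - 1);
}

/* Hypothetical helper: entries posted by the host but not yet reaped. */
static int ring_pending(unsigned int nentries_mask,
			unsigned int write_idx, unsigned int sw_idx)
{
	return CE_RING_DELTA(nentries_mask, sw_idx, write_idx);
}
```

Plugging in the logged values (nentries_mask 0x1f, write_idx 2, sw-idx 3) gives 0 free slots and 31 pending entries, so a request for n_items 1 cannot fit. In other words, the host side believes every usable slot of the 32-entry ring is still waiting for a completion, whatever the firmware has actually consumed.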
Thanks,
Ben
--
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc http://www.candelatech.com