[RFC v2 05/11] ath10k: htc: refactorization

Michal Kazior michal.kazior at tieto.com
Tue Dec 13 11:21:53 PST 2016


On 13 December 2016 at 19:37, Erik Stromdahl <erik.stromdahl at gmail.com> wrote:
>
>
> On 12/13/2016 06:26 PM, Valo, Kalle wrote:
>> Michal Kazior <michal.kazior at tieto.com> writes:
>>
>>> On 13 December 2016 at 14:44, Valo, Kalle <kvalo at qca.qualcomm.com> wrote:
>>>> Erik Stromdahl <erik.stromdahl at gmail.com> writes:
>>>>
>>>>> Code refactorization:
>>>>>
>>>>> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
>>>>> to ath10k_htc_control_rx_complete.
>>>>>
>>>>> This eases the implementation of SDIO/mbox significantly since
>>>>> the ep_rx_complete cb is invoked directly from the SDIO/mbox
>>>>> hif layer.
>>>>>
>>>>> Since the ath10k_htc_control_rx_complete already is present
>>>>> (only containing a warning message) there is no reason for not
>>>>> using it (instead of having a special case for ep 0 in
>>>>> ath10k_htc_rx_completion_handler).
>>>>>
>>>>> Signed-off-by: Erik Stromdahl <erik.stromdahl at gmail.com>
>>>>
>>>> I tested this on QCA988X PCI board just to see if there are any
>>>> regressions. It crashes immediately during module load, every time, and
>>>> bisected that the crashing starts on this patch:
>>>>
>>>> [ 1239.715325] ath10k_pci 0000:02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
>>>> [ 1239.885125] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:02:00.0.bin failed with error -2
>>>> [ 1239.885260] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/cal-pci-0000:02:00.0.bin failed with error -2
>>>> [ 1239.885687] ath10k_pci 0000:02:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043202ff sub 0000:0000
>>>> [ 1239.885699] ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
>>>> [ 1239.885899] ath10k_pci 0000:02:00.0: firmware ver 10.2.4.70.59-2 api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
>>>> [ 1239.941836] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
>>>> [ 1239.941993] ath10k_pci 0000:02:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
>>>> [ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   (null)
>>>> [ 1241.136738] IP: [<  (null)>]   (null)
>>>> [ 1241.136759] *pdpt = 0000000000000000 *pde = f0002a55f0002a55 [ 1241.136781]
>>>> [ 1241.136793] Oops: 0010 [#1] SMP
>>>>
>>>> What's odd is that when I added some printks on my own and enabled both
>>>> boot and htc debug levels it doesn't crash anymore. After everything
>>>> works normally after that, I can start AP mode and connect to it. Is it
>>>> a race somewhere?
>>>
>>> Yes. htc_wait_target() is called after hif_start(). The ep_rx_complete
>>> is set in htc_wait_target() [changed patch 4, but still too late].
>>>
>>> ep_rx_complete must be set prior to calling hif_start(). You probably
>>> crash on end of ath10k_htc_rx_completion_handler() when trying to call
>>> ep->ep_ops.ep_rx_complete(ar, skb).
>>
>> Yeah, just checked and ep->ep_ops.ep_rx_complete is NULL at the end of
>> ath10k_htc_rx_completion_handler().
>>
> It is indeed correct as Michal points out, there is a risk that the
> first HTC control message (typically an HTC ready message) is received
> before the HTC control endpoint is connected.
>
> I have experienced a similar race with my SDIO implementation as well.
> In this case I did solve the issue by enabling HIF target interrupts
> after the HTC control endpoint was connected. I am not sure however if
> this is the most elegant way to solve this problem.
>
> My SDIO target won't send the HTC ready message before this is done.
> The fix essentially consists of moving the ..._irg_enable call from
> hif_start into another hif op.

It makes more sense to move ep_rx_complete setup/assignment before
hif_start(). This assignment should be done very early as there is
nothing to change/override for this endpoint during operation, is
there? It's known what it needs to store very early on.


> I have made a few updates since I submitted the original RFC and created
> a repo on github:
>
> https://github.com/erstrom/linux-ath
>
> I have a bunch of branches that are all based on the tags on the ath master.
>
> As of this moment the latest version is:
>
> ath-201612131156-ath10k-sdio
>
> This branch contains the original RFC patches plus some addons/fixes.
>
> In the above mentioned branch there are a few commits related to this
> race condition. Perhaps you can have a look at them?
>
> The commits are:
> 821672913328cf737c9616786dc28d2e4e8a4a90

I would avoid if(bus==xx) checks.


> dd7fcf0a1f78e68876d14f90c12bd37f3a700ad7
> 7434b7b40875bd08a3a48a437ba50afed7754931
>
> Perhaps this approach can work with PCIe as well?

I think I did contemplate the unmask/start distinction at some point
but I didn't go with it for some reason I can't recall now.


Michał



More information about the ath10k mailing list