More than one ath10k NIC will not load (bisected)

Ben Greear greearb at candelatech.com
Wed Feb 4 22:41:47 PST 2015


On 02/04/2015 10:02 PM, Michal Kazior wrote:
> On 4 February 2015 at 16:23, Ben Greear <greearb at candelatech.com> wrote:
>> On 02/04/2015 01:35 AM, Michal Kazior wrote:
>>>
>>> On 4 February 2015 at 10:07, Kalle Valo <kvalo at qca.qualcomm.com> wrote:
>>>>
>>>> Ben Greear <greearb at candelatech.com> writes:
>>>>
>>>>>> Hmm.. This removes warm_reset in probe function but I fail to see how
>>>>>> this could end up not loading one of the NICs *silently*?
>>>>>>
>>>>>> Anyway there's a pending patch which adds the reset back:
>>>>>>
>>>>>> https://github.com/kvalo/ath/commit/bdcd6f4e4ac5d2d2a56da4813f56655e6db0ee45
>>>>>> . You might want to try it and see if it helps.
>>>>>
>>>>>
>>>>> Reverting the patch made it work again for me.
>>>>>
>>>>> I don't understand that code well, but perhaps you are disabling
>>>>> a shared interrupt that silently stops the second NIC from
>>>>> being able to do its thing?
>>>>>
>>>>> Do you have a PC with 2 NICs in it that you could try yourself?
>>>>>
>>>>> I can grab you the logs of a failure to boot later today.
>>>>
>>>>
>>>> What should we do with this one? I didn't look at the details yet, but
>>>> do we have any other option than to revert?
>>>
>>>
>>> I believe this is an issue in Ben's userspace (he sent me logs
>>> privately) or some sort of kernel event bug. It basically looked like
>>> this: both devices were detected by ath10k and both started
>>> register_work. One of the devices loaded all the way while the other
>>> tried to load a few nonexistent firmware files and then stopped. A few
>>> minutes later there was a hung task splat pointing to
>>> request_firmware() called from ath10k, suggesting userspace didn't
>>> handle the firmware request.
>>>
>>> The "offending" patch effectively removed 200ms from probe() in
>>> ath10k. This could've changed the timing of request_firmware() calls on
>>> Ben's system. Btw. the 200ms is back again now with
>>> 1a7fecb766c83dace747f42b25bbb544b00a0163 ("ath10k: reset chip before
>>> reading chip_id in probe").
>>>
>>> Marek tried running 2 qca988x on his laptop some time ago (with and
>>> without the extra timing) and didn't have any issues.
>>
>>
>> I can retry my system with stock Fedora 20 and see if it works there.
>>
>> If not, then I think it still needs to be worked on...you agree?
>
> In the driver? I'd argue. I don't see how ath10k could make
> request_firmware() hang, do you?

I can lard up the kernel with lockdep and related debugging and see if
that offers some clues.  It will be a while, though; I'm pretty busy
with other things at the moment.
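
A rough sketch of what that could look like, assuming the usual
lock-debugging and hung-task symbols (an illustrative set, not a
config that has been built or tested against this problem):

    # lock debugging (lockdep)
    CONFIG_PROVE_LOCKING=y
    CONFIG_DEBUG_LOCK_ALLOC=y
    CONFIG_DEBUG_MUTEXES=y
    CONFIG_DEBUG_SPINLOCK=y
    CONFIG_DEBUG_ATOMIC_SLEEP=y
    # report tasks stuck in uninterruptible sleep,
    # e.g. one blocked in request_firmware()
    CONFIG_DETECT_HUNG_TASK=y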

Thanks,
Ben

>
>
> Michał
>


-- 
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



