[PATCH v3] ath10k: add flag to protect napi operation to avoid dead loop hang

Kalle Valo kvalo at codeaurora.org
Tue Dec 15 03:05:35 EST 2020


Wen Gong <wgong at codeaurora.org> writes:

> On 2020-12-09 23:00, Ben Greear wrote:
>> On 12/9/20 1:24 AM, Kalle Valo wrote:
>>> Wen Gong <wgong at codeaurora.org> writes:
>>>
>>>> On 2020-09-08 00:22, Kalle Valo wrote:
>>>>
>>>>> Just like with the recent firmware restart patch, isn't
>>>>> ar->napi_enabled
>>>>> racy? Wouldn't test_and_set_bit() and test_and_clear_bit() be safer?
>>>>>
>>>>> Or are we holding a lock? But then that should be documented with
>>>>> lockdep_assert_held().
>>>>
>>>> yes, ath10k_hif_start is only called from ath10k_core_start, it has
>>>> "lockdep_assert_held(&ar->conf_mutex)", and ath10k_hif_stop is only
>>>> called from ath10k_core_stop, it also has
>>>> "lockdep_assert_held(&ar->conf_mutex)". then it will not 2 thread
>>>> both
>>>> enter ath10k_hif_start/ath10k_hif_stop meanwhile.
>>>
>>> Ok, but every function depending on a lock being held should still
>>> call
>>> lockdep_assert_held(), that way we can catch the bug if locking
>>> changes
>>> later. So it's not enough that ath10k_core_stop() has
>>> lockdep_assert_held(), also these napi functions should have it.
>>>
>>> I actually decided to switch using ATH10K_FLAG_NAPI_ENABLED with
>>> set_bit() & co, simpler locking that way and no lockdep_assert_held()
>>> needed anymore. Please check my changes in the pending branch, I have
>>> only compile tested them:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=e0a466d296bd862080f7796b41349f9f586272c9
>>>
>>
>> Why do you not need locking?  You can't just check a bit is set and
>> then do work and set
>> it later without locking, two concurrent CPU threads can pass the
>> first check and both get into
>> the logic below it?
>>
> maybe because which I said before:
>
> ath10k_hif_start is only called from ath10k_core_start, it has
> "lockdep_assert_held(&ar->conf_mutex)", and ath10k_hif_stop is only
> called from ath10k_core_stop, it also has
> "lockdep_assert_held(&ar->conf_mutex)". then it will not 2 thread both
> enter ath10k_hif_start/ath10k_hif_stop meanwhile.

Yeah, but that was not visible from the code. I now changed the patch in
pending branch that this is clearly documented with
lockdep_assert_held() and lockdep will warn if someone breaks the
locking later on.

If a function relies on a lock being held, lockdep_assert_held() needs
to be _always_ used to make the locking dependencies clearly visible.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



More information about the ath10k mailing list