[PATCH v2] ath10k: add flag to protect napi operation to avoid dead loop hang for SDIO

Wen Gong wgong at codeaurora.org
Tue Aug 25 05:58:44 EDT 2020


On 2020-08-25 16:24, Krishna Chaitanya wrote:
> On Tue, Aug 25, 2020 at 9:11 AM Wen Gong <wgong at codeaurora.org> wrote:
>> 
>> On 2020-08-24 19:15, Krishna Chaitanya wrote:
>> > On Mon, Aug 24, 2020 at 4:15 PM Wen Gong <wgong at codeaurora.org> wrote:
>> >>
>> >> On 2020-08-24 18:03, Krishna Chaitanya wrote:
>> >> > On Mon, Aug 24, 2020 at 3:10 PM Wen Gong <wgong at codeaurora.org> wrote:
>> >> >>
>> >> >> On 2020-08-24 16:35, Krishna Chaitanya wrote:
>> >> >> > On Mon, Aug 24, 2020 at 10:03 AM Wen Gong <wgong at codeaurora.org> wrote:
>> >> >> >>
>> >> >> >> It happened "Kernel panic - not syncing: hung_task: blocked tasks"
>> >> >> >> when
>> >> >> >> test simulate crash and ifconfig down/rmmod meanwhile.
>> >> >> >>
>> >> >> ...
>> >> >> >>
>> >> >> >>  #ifdef CONFIG_PM
>> >> >> > Even though your DUT is SDIO based we should be doing this in general
>> >> >> > for all, no?
>> >> >> > core_restart + hif_stop is common to all.
>> >> >> this patch does not have core_restart.
>> >> > I was referring to the combination which is causing the issue.
>> >> >
>> >> >> I dit not hit the issue for others bus(PCIe,SNOC...), so I can not
>> >> >> change them with a
>> >> >> assumption they also have this issue.
>> >> > But that doesn't make sense, the combination is being hit for others
>> >> > also.
>> >> > (they should also endup calling napi_disable twice?) or they are using
>> >> > some other check to avoid this (doesn't appear so from a quick look at
>> >> > the
>> >> > code).
>> >> Because I only use SDIO, I did not use others BUS, so I did not hit
>> >> the
>> >> issue
>> >> on other BUS.
>> > I understand, my point was based on the description the issue looks
>> > independent
>> > of the BUS type, so, the fix should also be generic. I understand that
>> > your testing
>> > is only focused on SDIO, but we should have a generic fix and probably
>> > use
>> > communities help to get it tested rather than fixing SDIO only.
>> I checked the ath10k, only sdio.c, snoc.c, pci.c have used napi.
>> I think it can change to move the
>> napi_synchronize/napi_disable/napi_enable from
>> sido.c/snoc.c/pci.c to ath10k_core.ko as below:
>> void ath10k_core_napi_enable(struct ath10k *ar)
>> {
>>         if (!ar->napi_enabled) {
>>                 napi_enable(&ar->napi);
>>                 ar->napi_enabled = true;
>>         }
>> }
>> EXPORT_SYMBOL(ath10k_core_napi_enable);
>> 
>> void ath10k_core_napi_disable_sync(struct ath10k *ar)
>> {
>>         if (ar->napi_enabled) {
>>                 napi_synchronize(&ar->napi);
>>                 napi_disable(&ar->napi);
>>                 ar->napi_enabled = false;
>>         }
>> }
>> EXPORT_SYMBOL(ath10k_core_napi_disable_sync);
>> 
>> is it appropriate?
>> ...
> Yes, this is perfect. One minor comment you can just do the
> check initially and return.
> 
> if (ar->napi_enabled)
>    return
Yes, I will change that.
But who can test for SNOC and PCIe, I have tested with SDIO, it is OK.
Govind, could you help to test SNOC?
If no people test PCIe, I can also test it.



More information about the ath10k mailing list