[PATCH] wifi: ath10k: poll service ready message before failing

Baochen Qiang quic_bqiang at quicinc.com
Wed Feb 21 18:18:55 PST 2024



On 2/21/2024 8:38 PM, James Prestwood wrote:
> Hi Baochen,
> 
> On 2/20/24 7:17 PM, Baochen Qiang wrote:
>> Currently host relies on CE interrupts to get notified that
>> the service ready message is ready. This results in timeout
>> issue if the interrupt is not fired, due to some unknown
>> reasons. See below logs:
>>
>> [76321.937866] ath10k_pci 0000:02:00.0: wmi service ready event not 
>> received
>> ...
>> [76322.016738] ath10k_pci 0000:02:00.0: Could not init core: -110
>>
>> And finally it causes WLAN interface bring up failure.
>>
>> Change to give it one more chance here by polling CE rings,
>> before failing directly.
>>
>> Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00157-QCARMSWPZ-1
>>
>> Fixes: 5e3dd157d7e7 ("ath10k: mac80211 driver for Qualcomm Atheros 
>> 802.11ac CQA98xx devices")
>> Reported-by: James Prestwood <prestwoj at gmail.com>
>> Link: 
>> https://lore.kernel.org/linux-wireless/304ce305-fbe6-420e-ac2a-d61ae5e6ca1a@gmail.com/
>> Signed-off-by: Baochen Qiang <quic_bqiang at quicinc.com>
>> ---
>>   drivers/net/wireless/ath/ath10k/wmi.c | 22 +++++++++++++++++++---
>>   1 file changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c 
>> b/drivers/net/wireless/ath/ath10k/wmi.c
>> index ddf15717d504..bf6cb2c73128 100644
>> --- a/drivers/net/wireless/ath/ath10k/wmi.c
>> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
>> @@ -1763,12 +1763,28 @@ void ath10k_wmi_put_wmi_channel(struct ath10k 
>> *ar, struct wmi_channel *ch,
>>   int ath10k_wmi_wait_for_service_ready(struct ath10k *ar)
>>   {
>> -    unsigned long time_left;
>> +    unsigned long time_left, i;
>>       time_left = wait_for_completion_timeout(&ar->wmi.service_ready,
>>                           WMI_SERVICE_READY_TIMEOUT_HZ);
>> -    if (!time_left)
>> -        return -ETIMEDOUT;
>> +    if (!time_left) {
>> +        /* Sometimes the PCI HIF doesn't receive interrupt
>> +         * for the service ready message even if the buffer
>> +         * was completed. PCIe sniffer shows that it's
>> +         * because the corresponding CE ring doesn't fires
>> +         * it. Workaround here by polling CE rings once.
>> +         */
>> +        ath10k_warn(ar, "failed to receive service ready completion, 
>> polling..\n");
>> +
>> +        for (i = 0; i < CE_COUNT; i++)
>> +            ath10k_hif_send_complete_check(ar, i, 1);
>> +
>> +        time_left = wait_for_completion_timeout(&ar->wmi.service_ready,
>> +                            WMI_SERVICE_READY_TIMEOUT_HZ);
>> +        if (!time_left)
>> +            return -ETIMEDOUT;
>> +    }
>> +
>>       return 0;
>>   }
>>
>> base-commit: 707e306f3573fa321ae197d77366578e4566cff5
> 
> Thank you for looking at this I will test this and see if it resolves 
> the problem we're seeing but since its somewhat rare it may take me a 
> bit to validate.
> 
> Is this any different than just trying to bring up the interface again 
> from userspace? I could be wrong, but my concern with this is that when 
> I retried in userspace things got into a very odd state:
> 
>   - IWD starts
> 
>   - ifdown interface
> 
>   - ifup interface, timeout -110
> 
>   - Retry ifup, success
> 
>   - Authenticate/associate succeed
> 
>   - 4-way handshake fails because the device never received the 1/4 frame.
> 
Don't get time to look into this case, but I suppose there might be some 
issues in error handling when interface up fails, kind of incorrect irq 
enable/disable or something else impacting data path, so no data frame 
received even after a second interface up retry succeeds,

Anyway please test this patch, which is supposed to be the right fix to 
this issue.

> IWD would then retry indefinitely with auth/assoc succeeding but never 
> receiving any 4-way handshake frames. The only way to get things working 
> again was reloading the ath10k driver/reboot. Maybe this patch is 
> different because its waiting for the initial request and no issuing a 
> second one? Just wanted to point that out in case it sheds any light.
> 
> Thanks,
> 
> James
> 



More information about the ath10k mailing list