[PATCH v3 0/3] Enable low power mode when WLAN is not active

Manikanta Pubbisetty quic_mpubbise at quicinc.com
Thu Dec 8 03:20:43 PST 2022


On 12/8/2022 1:24 PM, Kalle Valo wrote:
> Manikanta Pubbisetty <quic_mpubbise at quicinc.com> writes:
> 
>> On 11/23/2022 9:35 PM, Kalle Valo wrote:
>>
>>> Kalle Valo <kvalo at kernel.org> writes:
>>>
>>>> Manikanta Pubbisetty <quic_mpubbise at quicinc.com> writes:
>>>>
>>>>> Currently, WLAN chip is powered once during driver probe and is kept
>>>>> ON (powered) always even when WLAN is not active; keeping the chip
>>>>> powered ON all the time will consume extra power which is not
>>>>> desirable for battery operated devices. Same is the case with non-WoW
>>>>> suspend, chip will not be put into low power mode when the system is
>>>>> suspended resulting in higher battery drain.
>>>>>
>>>>> Send QMI MODE OFF command to firmware during WiFi OFF to put device
>>>>> into low power mode.
>>>>>
>>>>> Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00887-QCAMSLSWPLZ-1
>>>>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.16
>>>>>
>>>>> Manikanta Pubbisetty (3):
>>>>>     ath11k: Fix double free issue during SRNG deinit
>>>>>     ath11k: Move hardware initialization logic to start()
>>>>>     ath11k: Enable low power mode when WLAN is not active
>>>>> ---
>>>>> V3:
>>>>>    - Removed patch "ath11k: Fix failed to parse regulatory event print" as it is not needed anymore
>>>>>    - Fixed a potential deadlock scenario reported by lockdep around ab->core_lock with V2 changes
>>>>>    - Fixed other minor issues that were found during code review
>>>>>    - Spelling corrections in the commit messages
>>>>
>>>> I still see a crash, immediately after the first rmmod:
>>>>
>>>> Nov 22 11:05:47 nuc2  [  139.378719] rmmod ath11k_pci
>>>> Nov 22 11:05:48 nuc2 [ 139.892395] general protection fault, probably
>>>> for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP
>>>> DEBUG_PAGEALLOC KASAN
>>>> Nov 22 11:05:48 nuc2 [ 139.892453] KASAN: null-ptr-deref in range
>>>> [0x00000000000001f0-0x00000000000001f7]
>>>>
>>>> Really odd that you don't see it. Unfortunately not able to debug this
>>>> further right now.
>>>>
>>>> This is with:
>>>>
>>>> wcn6855 hw2.0 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
>>>
>>> A bit more information how I see the crash. So first I have all modules
>>> loaded:
>>>
>>> $ lsmod
>>> Module                  Size  Used by
>>> ath11k_pci             57344  0
>>> ath11k               2015232  1 ath11k_pci
>>> mac80211             3284992  1 ath11k
>>> libarc4                16384  1 mac80211
>>> cfg80211             2494464  2 ath11k,mac80211
>>> qmi_helpers            57344  1 ath11k
>>> qrtr_mhi               20480  0
>>> mhi                   217088  2 ath11k_pci,qrtr_mhi
>>> qrtr                   98304  5 qrtr_mhi
>>> nvme                  122880  3
>>> nvme_core             299008  5 nvme
>>> $
>>>
>>> Then I just remove ath11k_pci module and boom:
>>>
>>> $ sudo rmmod ath11k_pci
>>>
>>> [ 153.658409] general protection fault, probably for non-canonical
>>> address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
>>> KASAN
>>>
>>> This happens every time, there doesn't seem to be any randomness on the
>>> behaviour.
>>>
>>
>> Thanks for the help Kalle, this is exactly what I was doing in my
>> tests. Unfortunately, I'm not able to reproduce the problem. I have
>> also tried with the exact firmware that you have pointed out. Let me
>> see if I'm missing anything.
> 
> I tested this more and patch 3 seems to be the one causing the crash. I
> didn't see this when patch 1-2 were applied.
> 
> The crash happens in ath11k_dp_process_rxdma_err() in this line:
> 
> 	srng = &ab->hal.srng_list[err_ring->ring_id];
> 
> ab looks sane to me (0xffff88814c960000) but err_ring is set to 0x200.
> Does this help?
> 

Thanks for your time and analysis on the bug.

From the call stack that you shared earlier, it looks like
ath11k_dp_process_rxdma_err() is called from dp_service_srngs, which is
a NAPI poll callback.
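
As a side note for reading the report: the faulting address
0xdffffc000000003e is a KASAN shadow address, and the sketch below shows
how it maps back to the 0x1f0-0x1f7 range that KASAN printed. It assumes
the usual x86_64 generic-KASAN shadow offset and scale shift; the helper
names are made up for illustration and are not kernel functions.

```c
#include <stdint.h>

/* Assumed x86_64 generic-KASAN constants for this sketch. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: first byte of memory covered by a shadow byte. */
static uint64_t shadow_to_mem(uint64_t shadow_addr)
{
	return (shadow_addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Hypothetical helper: last byte covered by the same shadow byte. */
static uint64_t shadow_to_mem_end(uint64_t shadow_addr)
{
	return shadow_to_mem(shadow_addr) +
	       (1ULL << KASAN_SHADOW_SCALE_SHIFT) - 1;
}
```

Calling shadow_to_mem(0xdffffc000000003eULL) gives the start of the
reported range, which is consistent with a small offset from a NULL
(or nearly NULL) pointer.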

To me it looks like the NAPI handler of the driver is getting called
after the srng resources have been de-initialized during rmmod.

I have closely examined the changes in patch 3 (and I'm still doing so).
So far, I could not find anything that could trigger this kind of
problem. Since NAPI is disabled upon rmmod, NAPI poll should not be called.
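
To make the expected ordering concrete, here is a minimal userspace
model of it. This is only a sketch, not driver code: every name below is
a hypothetical stand-in (none are ath11k or kernel symbols), and the
flag only models "polling is disabled before ring state goes away".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ring and device state. */
struct model_ring {
	int ring_id;
};

struct model_dev {
	bool poll_enabled;           /* models NAPI being enabled */
	struct model_ring *err_ring; /* models ring state released at teardown */
};

/* Returns the ring id that was serviced, or -1 if the poll was skipped. */
static int model_poll(struct model_dev *dev)
{
	if (!dev->poll_enabled || dev->err_ring == NULL)
		return -1;
	return dev->err_ring->ring_id;
}

/* Teardown order assumed in this sketch: stop polling, then drop the ring. */
static void model_teardown(struct model_dev *dev)
{
	dev->poll_enabled = false;
	dev->err_ring = NULL;
}
```

If a poll ran after teardown without the guard in model_poll(), it would
dereference stale ring state, which is the kind of access the report
seems to show. In this single-threaded model the guard is enough; a real
driver would also have to wait for any poll that is already running.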

It's not quite clear whether I'm missing something.

Thanks,
Manikanta


