[PATCH v3 0/3] Enable low power mode when WLAN is not active
Manikanta Pubbisetty
quic_mpubbise at quicinc.com
Thu Dec 8 03:20:43 PST 2022
On 12/8/2022 1:24 PM, Kalle Valo wrote:
> Manikanta Pubbisetty <quic_mpubbise at quicinc.com> writes:
>
>> On 11/23/2022 9:35 PM, Kalle Valo wrote:
>>
>>> Kalle Valo <kvalo at kernel.org> writes:
>>>
>>>> Manikanta Pubbisetty <quic_mpubbise at quicinc.com> writes:
>>>>
>>>>> Currently, WLAN chip is powered once during driver probe and is kept
>>>>> ON (powered) always even when WLAN is not active; keeping the chip
>>>>> powered ON all the time will consume extra power which is not
>>>>> desirable for battery operated devices. Same is the case with non-WoW
>>>>> suspend, chip will not be put into low power mode when the system is
>>>>> suspended resulting in higher battery drain.
>>>>>
>>>>> Send QMI MODE OFF command to firmware during WiFi OFF to put device
>>>>> into low power mode.
>>>>>
>>>>> Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00887-QCAMSLSWPLZ-1
>>>>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.16
>>>>>
>>>>> Manikanta Pubbisetty (3):
>>>>> ath11k: Fix double free issue during SRNG deinit
>>>>> ath11k: Move hardware initialization logic to start()
>>>>> ath11k: Enable low power mode when WLAN is not active
>>>>> ---
>>>>> V3:
>>>>> - Removed patch "ath11k: Fix failed to parse regulatory event print" as it is not needed anymore
>>>>> - Fixed a potential deadlock scenario reported by lockdep around ab->core_lock with V2 changes
>>>>> - Fixed other minor issues that were found during code review
>>>>> - Spelling corrections in the commit messages
>>>>
>>>> I still see a crash, immediately after the first rmmod:
>>>>
>>>> Nov 22 11:05:47 nuc2 [ 139.378719] rmmod ath11k_pci
>>>> Nov 22 11:05:48 nuc2 [ 139.892395] general protection fault, probably
>>>> for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP
>>>> DEBUG_PAGEALLOC KASAN
>>>> Nov 22 11:05:48 nuc2 [ 139.892453] KASAN: null-ptr-deref in range
>>>> [0x00000000000001f0-0x00000000000001f7]
>>>>
>>>> Really odd that you don't see it. Unfortunately not able to debug this
>>>> further right now.
>>>>
>>>> This is with:
>>>>
>>>> wcn6855 hw2.0 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
>>>
>>> A bit more information how I see the crash. So first I have all modules
>>> loaded:
>>>
>>> $ lsmod
>>> Module Size Used by
>>> ath11k_pci 57344 0
>>> ath11k 2015232 1 ath11k_pci
>>> mac80211 3284992 1 ath11k
>>> libarc4 16384 1 mac80211
>>> cfg80211 2494464 2 ath11k,mac80211
>>> qmi_helpers 57344 1 ath11k
>>> qrtr_mhi 20480 0
>>> mhi 217088 2 ath11k_pci,qrtr_mhi
>>> qrtr 98304 5 qrtr_mhi
>>> nvme 122880 3
>>> nvme_core 299008 5 nvme
>>> $
>>>
>>> Then I just remove ath11k_pci module and boom:
>>>
>>> $ sudo rmmod ath11k_pci
>>>
>>> [ 153.658409] general protection fault, probably for non-canonical
>>> address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
>>> KASAN
>>>
>>> This happens every time, there doesn't seem to be any randomness on the
>>> behaviour.
>>>
>>
>> Thanks for the help Kalle, this is exactly what I was doing in my
>> tests. Unfortunately, I'm not able to reproduce the problem. I have
>> also tried with the exact firmware that you have pointed out. Let me
>> see if I'm missing anything.
>
> I tested this more and patch 3 seems to be the one causing the crash. I
> didn't see this when patch 1-2 were applied.
>
> The crash happens in ath11k_dp_process_rxdma_err() in this line:
>
> srng = &ab->hal.srng_list[err_ring->ring_id];
>
> ab looks sane to me (0xffff88814c960000) but err_ring is set to 0x200.
> Does this help?
>
Thanks for your time and analysis on the bug.
From the call stack that you shared earlier, it looks like
ath11k_dp_process_rxdma_err() is called from dp_service_srngs, which is a
NAPI poll callback.
To me it looks like the NAPI handler of the driver is getting called
after the SRNG resources have been de-initialized during rmmod.
I have closely examined the changes in patch 3 (and I'm still doing so). So
far, I could not find anything that could trigger this kind of
problem. Since NAPI is disabled upon rmmod, NAPI poll should not be called.
It's not quite clear to me whether I'm missing something.
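
As a side note, the two addresses in the quoted log are related by KASAN's
shadow mapping. Below is a minimal sketch of that arithmetic, assuming the
x86_64 generic KASAN layout (shadow offset 0xdffffc0000000000, scale shift 3).
The helper names are hypothetical and are not part of the kernel or of ath11k:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64 generic-KASAN constants; other architectures differ. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: first memory address covered by a shadow byte. */
static inline uint64_t kasan_shadow_to_mem(uint64_t shadow_addr)
{
	return (shadow_addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Hypothetical helper: last address of the 8-byte granule starting at mem. */
static inline uint64_t kasan_granule_last(uint64_t mem)
{
	return mem + (1ULL << KASAN_SHADOW_SCALE_SHIFT) - 1;
}

/* Check the helpers against the values printed in the quoted log. */
static inline void kasan_decode_selfcheck(void)
{
	/* 0x3e << 3 == 0x1f0, the start of the reported range. */
	assert(kasan_shadow_to_mem(0xdffffc000000003eULL) == 0x1f0ULL);
	/* One shadow byte covers 8 bytes, so the range ends at 0x1f7. */
	assert(kasan_granule_last(0x1f0ULL) == 0x1f7ULL);
}
```

Under these assumptions, the non-canonical address 0xdffffc000000003e is the
shadow lookup for a memory access at 0x1f0, which is why KASAN reports a
null-ptr-deref in [0x1f0-0x1f7] rather than an access to a random address.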
Thanks,
Manikanta