wifi: ath12k: start-up crash with WCN7850 hw2.0 on TI AM69-SK board
Baochen Qiang
quic_bqiang at quicinc.com
Wed Feb 19 02:18:17 PST 2025
On 2/5/2025 10:20 AM, Baochen Qiang wrote:
>
>
> On 1/27/2025 10:01 PM, Parth Pancholi wrote:
>> Hi,
>>
>> I am currently debugging the ath12k_pci_enable_ltssm start up crash/bug
>> with the mainline kernel on my system and would like to share my
>> observations so far:
>>
>> The ath12k mainline driver gets stuck at this specific line:
>> https://github.com/torvalds/linux/blob/9c5968db9e625019a0ee5226c7eebef5519d366a/drivers/net/wireless/ath/ath12k/pci.c#L295
>> in ath12k_pci_enable_ltssm(), which attempts to read
>> GCC_GCC_PCIE_HOT_RST, specifically at
>> https://github.com/torvalds/linux/blob/9c5968db9e625019a0ee5226c7eebef5519d366a/drivers/net/wireless/ath/ath12k/pci.c#L1209
>
> Thanks for narrowing this down, really helpful.
>
> We have also observed this issue internally, although at a different line:
>
> https://github.com/torvalds/linux/blob/9c5968db9e625019a0ee5226c7eebef5519d366a/drivers/net/wireless/ath/ath12k/pci.c#L298
>
> For now I suspect that GCC_GCC_PCIE_HOT_RST is not a valid register on the WLAN target
> side. I will check internally and get back to you.
Parth, could you apply the change below and try again?
-#define GCC_GCC_PCIE_HOT_RST 0x1e38338
+#define GCC_GCC_PCIE_HOT_RST 0x1e40304
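
For context, here is a minimal userspace sketch of how a windowed register access would split these two offsets. It assumes a 512 KiB window (19 offset bits) and a 6-bit window index, which is how I recall the windowing constants in ath12k's pci.c; the macro and function names below are hypothetical and the values should be checked against the tree before relying on them.

```c
#include <stdint.h>

/*
 * Assumed layout (not verified against the driver): bits 18:0 of a
 * register offset select the byte inside the window, bits 24:19
 * select which window is mapped.
 */
#define SKETCH_WINDOW_SHIFT      19
#define SKETCH_WINDOW_RANGE_MASK 0x7ffffu /* bits 18:0 */
#define SKETCH_WINDOW_INDEX_MASK 0x3fu    /* six bits, 24:19 after shifting */

struct window_split {
	uint32_t window; /* index of the window to select */
	uint32_t offset; /* offset of the register inside that window */
};

/* Split a full register offset into window index and in-window offset. */
static inline struct window_split split_offset(uint32_t reg)
{
	struct window_split s;

	s.window = (reg >> SKETCH_WINDOW_SHIFT) & SKETCH_WINDOW_INDEX_MASK;
	s.offset = reg & SKETCH_WINDOW_RANGE_MASK;
	return s;
}
```

Under these assumptions both the old offset 0x1e38338 and the new offset 0x1e40304 fall into window 0x3c, with in-window offsets 0x38338 and 0x40304 respectively, so the proposed change only moves the access within the same window.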
>
>>
>> Interestingly, within the same function, the line val =
>> ath12k_pci_read32(ab, PCIE_PCIE_PARF_LTSSM) successfully reads the
>> expected value 0x111 for PCIE_PCIE_PARF_LTSSM.
>>
>> I am continuing to debug from my end, although my understanding of the
>> ath12k driver is limited. Any leads, suggestions, or hints to help
>> resolve this issue would be greatly appreciated.
>>
>> Thank you.
>>
>> Regards,
>> Parth P
>>
>>
>> On Fri, 2025-01-24 at 10:02 +0000, Parth Pancholi wrote:
>>> I appreciate your response, Baochen.
>>>
>>> I have been working on enabling mainline kernel support on my TI
>>> AM69-SK board to test the mainline ath12k driver on my system.
>>>
>>> Using the mainline kernel repository for the ath drivers [1], I made
>>> the following observation:
>>> While the exact crash observed earlier is no longer present, the system
>>> hangs upon loading the ath12k mainline driver, displaying the messages
>>> below.
>>>
>>> root@am69-sk:~# modprobe ath12k debug_mask=0xffffffff
>>> [ 1121.996554] ath12k_pci 0000:01:00.0: BAR 0 [mem 0x4410200000-0x44103fffff 64bit]: assigned
>>> [ 1122.004884] ath12k_pci 0000:01:00.0: enabling device (0000 -> 0002)
>>> [ 1122.011818] ath12k_pci 0000:01:00.0: MSI vectors: 16
>>> [ 1122.016798] ath12k_pci 0000:01:00.0: Hardware name: wcn7850 hw2.0
>>> [ 1122.040183] NET: Registered PF_QIPCRTR protocol family
>>>
>>> root@am69-sk:~# uname -a
>>> Linux am69-sk 6.13.0-rc7-wt-ath-ge7ef944b3e2c-dirty #2 SMP PREEMPT Wed Jan 22 16:55:17 CET 2025 aarch64 GNU/Linux
>>>
>>> root@am69-sk:~# lspci
>>> 0000:00:00.0 PCI bridge: Texas Instruments Device b012
>>> 0000:01:00.0 Network controller: Qualcomm Technologies, Inc WCN785x Wi-Fi 7(802.11be) 320MHz 2x2 [FastConnect 7800] (rev 01)
>>> 0001:00:00.0 PCI bridge: Texas Instruments Device b012
>>> 0002:00:00.0 PCI bridge: Texas Instruments Device b012
>>>
>>> Do you have any insights into what might still be missing or incorrect
>>> in my setup?
>>>
>>> Regards,
>>> Parth P
>>>
>>> On Wed, 2025-01-22 at 15:20 +0800, Baochen Qiang wrote:
>>>>
>>>>
>>>> On 1/21/2025 10:19 PM, Parth Pancholi wrote:
>>>>> Hi All,
>>>>>
>>>>> I am performing tests on the SX-PCEBE Wi-Fi module, which utilizes the
>>>>> ATH12k driver, on the Texas Instruments AM69-SK board.
>>>>> The board is running the TI Linux Kernel from the ti-linux-6.6.y
>>>>
>>>> 6.6 is too old, and besides we don't support customer kernels.
>>>>
>>>> Could you try latest ath tree [1] or the mainline tree [2]?
>>>>
>>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/ath/ath.git/
>>>> [2]
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
>>>>
>>>> If the issue is still seen, please enable verbose ath12k logging with the
>>>> command below and collect the dmesg logs:
>>>>
>>>> sudo modprobe ath12k debug_mask=0xffffffff
>>>>
>>>> One more thing: the OpenWrt patch is overkill. Can you narrow down which
>>>> line of code in ath12k_pci_enable_ltssm() is causing this issue?
>>>>
>>>>
>>>>> branch. During testing, I observed a kernel crash from the ATH12k
>>>>> driver as soon as the probe is called. The crash log is as follows:
>>>>>
>>>>> [ 9.492631] Kernel panic - not syncing: Asynchronous SError Interrupt
>>>>> [ 9.492634] CPU: 7 PID: 222 Comm: (udev-worker) Not tainted 6.6.58-01497-ga7758da17c28-dirty #1
>>>>> [ 9.492638] Hardware name: Texas Instruments AM69 SK (DT)
>>>>> [ 9.492640] Call trace:
>>>>> [ 9.492642] dump_backtrace+0x94/0xec
>>>>> [ 9.492658] show_stack+0x18/0x24
>>>>> [ 9.492662] dump_stack_lvl+0x48/0x60
>>>>> [ 9.492669] dump_stack+0x18/0x24
>>>>> [ 9.492672] panic+0x320/0x378
>>>>> [ 9.492677] nmi_panic+0x8c/0x90
>>>>> [ 9.492681] arm64_serror_panic+0x6c/0x78
>>>>> [ 9.492686] do_serror+0x3c/0x78
>>>>> [ 9.492692] el1h_64_error_handler+0x34/0x4c
>>>>> [ 9.492697] el1h_64_error+0x64/0x68
>>>>> [ 9.492700] ath12k_pci_read32+0x1bc/0x1e8 [ath12k]
>>>>> [ 9.492725] ath12k_pci_power_up+0xdc/0x340 [ath12k]
>>>>> [ 9.492747] ath12k_core_init+0x2c/0xa8 [ath12k]
>>>>> [ 9.492769] ath12k_pci_probe+0x698/0x908 [ath12k]
>>>>> [ 9.492791] pci_device_probe+0xa8/0x16c
>>>>> [ 9.492800] really_probe+0x110/0x27c
>>>>> [ 9.492805] __driver_probe_device+0x78/0x12c
>>>>> [ 9.492808] driver_probe_device+0x3c/0x118
>>>>> [ 9.492810] __driver_attach+0x74/0x124
>>>>> [ 9.492813] bus_for_each_dev+0x78/0xd8
>>>>> [ 9.492819] driver_attach+0x24/0x30
>>>>> [ 9.492824] bus_add_driver+0xe4/0x208
>>>>> [ 9.492828] driver_register+0x60/0x128
>>>>> [ 9.492831] __pci_register_driver+0x44/0x50
>>>>> [ 9.492835] ath12k_pci_init+0x2c/0x6c [ath12k]
>>>>> [ 9.492858] do_one_initcall+0x70/0x1b4
>>>>> [ 9.492861] do_init_module+0x58/0x1e4
>>>>> [ 9.492867] load_module+0x19bc/0x1a8c
>>>>> [ 9.492869] init_module_from_file+0x88/0xc4
>>>>> [ 9.492873] __arm64_sys_finit_module+0x1c0/0x2ac
>>>>> [ 9.492877] invoke_syscall+0x44/0x108
>>>>> [ 9.492882] el0_svc_common.constprop.0+0xc0/0xe0
>>>>> [ 9.492885] do_el0_svc+0x1c/0x28
>>>>> [ 9.492889] el0_svc+0x2c/0x84
>>>>> [ 9.492892] el0t_64_sync_handler+0xc0/0xc4
>>>>> [ 9.492895] el0t_64_sync+0x190/0x194
>>>>> [ 9.492899] SMP: stopping secondary CPUs
>>>>> [ 9.492908] Kernel Offset: disabled
>>>>> [ 9.492909] CPU features: 0x0,80000200,28020000,1000420b
>>>>> [ 9.492913] Memory Limit: none
>>>>>
>>>>> Upon searching online, I found an OpenWrt patch that appears to
>>>>> address a similar issue ("Prevent LTSSM Startup Crash"):
>>>>> https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=package/kernel/mac80211/patches/ath12k/100-ath12k-prevent-ltssm-startup-crash.patch;h=cd85a0f6aa2652d62bfbea04e9bcca3bcf831b7f;hb=935b2b7dcef61b2893ed5dff307dd8f8a1156899
>>>>> With the above patch applied, I do not see the crash anymore.
>>>>>
>>>>> Could anyone confirm whether this issue has been reported before or is
>>>>> a known bug, or provide any insights?
>>>>> Any additional information or suggestions would be greatly appreciated.
>>>>>
>>>>> Details about the test setup:
>>>>> TI-AM69-SK board:
>>>>> https://www.ti.com/tool/SK-AM69?keyMatch=am69%20sk&tisearch=universal_search
>>>>> Silex WiFi card SX-PCEBE:
>>>>> https://www.silextechnology.com/connectivity-solutions/embedded-wireless/sx-pcebe
>>>>> TI Linux Repo:
>>>>> https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/?h=ti-linux-6.6.y
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Regards,
>>>>> Parth P
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
>