ath12k WCN7850: Q6 Hexagon fault at WLAON region 0x1792000 ~2s post-AUTHORIZE on X1E80100

Baochen Qiang baochen.qiang at oss.qualcomm.com
Wed May 13 18:55:30 PDT 2026



On 5/14/2026 4:47 AM, Marcus Glocker wrote:
> On Wed, May 13, 2026 at 01:26:50PM +0200, Marcus Glocker wrote:
> 
>> On Wed, May 13, 2026 at 11:05:05AM +0800, Baochen Qiang wrote:
>>
>>>
>>>
>>> On 5/13/2026 3:59 AM, Marcus Glocker wrote:
>>>> On Tue, May 12, 2026 at 11:38:06AM +0800, Baochen Qiang wrote:
>>>>
>>>>>
>>>>>
>>>>> On 5/5/2026 5:08 AM, Marcus Glocker wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We're porting ath12k to OpenBSD as the qwz(4) driver, targeting Samsung
>>>>>> Galaxy Book4 Edge (X1E80100 SoC, WCN7850 hw2.0).  Scan, auth, 4-way
>>>>>> handshake all complete; ~2 seconds after WPA2 AUTHORIZE the WCN7850
>>>>>> firmware crashes deterministically with:
>>>>>>
>>>>>> 	dlpager_main.c:147 Non Page Fault Exception cause code 0x 23
>>>>>> 	at Address: 0x 1792000
>>>>>>
>>>>>> Cause code 0x23 isn't a valid arm64 exception -- the fault is on the
>>>>>> WCN7850's on-die Hexagon Q6 DSP, with QURT's generic exception handler
>>>>>> (which happens to live in dlpager_main.c) printing it.  So this is not
>>>>>> a host CPU fault.
>>>>>>
>>>>>> Per the RDDM segment table (at the start of the dump), VA 0x01792000
>>>>>> is the start of the chip's WLAON_DUMP region (size 0x820).  The Q6 is
>>>>>> trying to read its own always-on hardware state region and the chip
>>>>>> refuses the access.
>>>>>>
>>>>>> (Samsung, Asus, Honor) with multiple FW builds.  Currently testing
>>>>>> with WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
>>>>>> (fw 0x110cffff, 2025-06-25) -- the exact blob a Linux ath12k user
>>>>>> runs successfully on the identical Samsung hardware.  Same board-2.bin,
>>>>>> same compiled DTB (upstream hamoa.dtsi based).
>>>>>>
>>>>>> We've field-compared qwz against ath12k and ruled out (byte-level or
>>>>>> wire-level):
>>>>>>
>>>>>>   * QMI host_cap, m3_info, wlan_cfg, wlan_ini, bdf_download (all
>>>>>>     fields including ce_config, svc_to_ce_map, shadow_reg_v3,
>>>>>>     feature_list, m3 paddr/size, nm_modem)
>>>>>>   * MHI bringup ordering (BHI -> wait SBL EE -> wait M0 -> BHIE)
>>>>>>   * BHI/BHIE DMA coherency
>>>>>>   * ASPM disable before MHI start
>>>>>>   * WLAON_WARM_SW_ENTRY zeroing + QFPROM_PWR_CTRL VDD4BLOW clear
>>>>>>   * static_window_map=false + window-bank register init
>>>>>>   * Per-chunk vs monolithic respond_mem allocation
>>>>>>   * WMI_PEER_MIMO_PS_STATE = WMI_PEER_SMPS_PS_NONE (added matching
>>>>>>     ath12k_setup_peer_smps; doesn't help)
>>>>>>   * FW image variation (c5 and c7 both fail identically)
>>>>>>
>>>>>> Specifically NOT involved (we have evidence either way):
>>>>>>
>>>>>>   * Gunyah -- X1E80100 is reportedly run in EL2 without Gunyah by
>>>>>>     users where ath12k works; so Gunyah isn't programming WLAON
>>>>>>     access for the Q6.
>>>>>>   * SMMU / pcie_smmu -- pcie_smmu is status="reserved" upstream,
>>>>>>     pcie4 has no iommus property; PCIe DMA bypasses SMMU.
>>>>>>   * SCM/PAS -- ath12k's PCIe path makes no qcom_scm_* calls.
>>>>>>
>>>>>> Question: what subsystem inside the WCN7850 firmware touches the
>>>>>> WLAON region at 0x01792000 around 2 seconds after the host sends
>>>>>> WMI_PEER_AUTHORIZE?  And what host-side configuration (WMI command,
>>>>>> HTT message, MHI state, etc.) primes that path so the access
>>>>>> succeeds on Linux?
>>>>>>
>>>>>> Even a pointer at the right Linux code path or the right FW-side
>>>>>> component would unblock us.  We have full RDDM dumps and dmesg
>>>>>> captures available; happy to share off-list or as attachments.
>>>>>
>>>>> Please help collect a dmesg log from a successful ath12k run and a dmesg log from the
>>>>> failing qwz run so we can compare them.
>>>>>
>>>>> Please enable verbose ath12k log when loading ath12k driver:
>>>>>
>>>>> If you are using the latest upstream ath12k:
>>>>>
>>>>> 	sudo modprobe ath12k debug_mask=0xffffffff
>>>>> 	sudo modprobe ath12k_wifi7
>>>>>
>>>>> If you are using an old ath12k:
>>>>>
>>>>> 	sudo modprobe ath12k debug_mask=0xffffffff
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Marcus
>>>>>>
>>>>>
>>>>
>>>> Hi Baochen,
>>>>
>>>> Thanks for coming back on this topic.
>>>>
>>>> Attached the OpenBSD dmesg, with full ath12k driver debug logging
>>>
>>> The dmesg shows several WMI_INIT cmd instances, which is not expected: in normal
>>> operation this command should be sent only once.
>>>
>>> cat dmesg |grep -w 'sending WMI command 0x1'
>>> May 12 19:35:46 x1e /bsd: qwz_wmi_cmd_send_nowait: sending WMI command 0x1
>>> May 12 19:37:20 x1e /bsd: qwz_wmi_cmd_send_nowait: sending WMI command 0x1
>>> May 12 19:37:41 x1e /bsd: qwz_wmi_cmd_send_nowait: sending WMI command 0x1
>>> May 12 19:37:46 x1e /bsd: qwz_wmi_cmd_send_nowait: sending WMI command 0x1
>>> May 12 19:37:50 x1e /bsd: qwz_wmi_cmd_send_nowait: sending WMI command 0x1
>>>
>>> Other than that, I don't find any other clues.
>>
>> Yes, that is specific to the OpenBSD NIC framework.  I've just tested
>> a quick hack with which the WMI_INIT cmd only gets issued once, but it
>> makes no difference to the firmware crash.
>>
>>>> enabled, plus the resulting RDDM binary after the firmware crash:
>>>
>>> How did you collect the RDDM binary? It does not seem to be in the right format; my tool
>>> cannot parse it correctly. Looking into the binary, at least the magic 'ATH12K-FW-DUMP'
>>> is not present at the very beginning.
>>
>> It looks like ath12k wraps the raw RDDM dump in some ath12k firmware
>> dump structure, which we don't do with our driver.  I did write a small
>> conversion program, trying to generate the dump which you expect.  You
>> can find the converted dump file here:
>>
>> https://nazgul.ch/pub/qwz0-rddm.bin.out.gz
>>
>> I hope you can load that into your tool.
>>
>>> And from which Linux version did you take the ath12k codebase?
>>
>> Well, that is a good question.  qwz (the ath12k OpenBSD driver), is
>> an initial clone of the qwx (the ath11k OpenBSD driver), which is
>> functional.  On top of that we did changes, of which the recent ones
>> did sync missing functionality from the Linux ath12k driver.  We did
>> already do a lot of comparison between qwz and the ath12k driver, but
>> we can't spot an obvious difference which could explain the firmware
>> crash.  Obviously that doesn't mean there isn't a gap between qwz and
>> ath12k related to this issue which we don't see.
>>
>>>>
>>>> https://nazgul.ch/pub/qwz0-rddm.bin.gz
>>>>
>>>> The command sequence on OpenBSD to reproduce that was:
>>>>
>>>> ifconfig qwz0 up                        # Bring the ath12k device up
>>>> ifconfig qwz0 scan                      # Scan for networks
>>>> ifconfig qwz0 nwid nazgul wpakey xxx    # Start association
>>>>
>>>> Hi Max,
>>>>
>>>> Since you have Linux running on exactly the same Samsung Galaxy Book4
>>>> Edge 14" laptop, where ath12k works, would you be so kind and also
>>>> provide the dmesg output showing a successful association with the
>>>> ath12k driver debug logging enabled?  See above how to enable that.
>>>> That would be very helpful!
>>>>
>>>> Thanks and Regards,
>>>> Marcus
>>>
> 
> Hi Baochen,
> 
> I just want to quickly let you know that we did overcome the firmware
> crash.  The culprit was that we did
> 
> 	#define RX_BE_PADDING0_BYTES 80 -> instead of 8
> 
> which did break the hal_rx_desc_wcn7850 struct:
> 
>   struct hal_rx_desc_wcn7850 {
>       u64                          msdu_end_tag;     // offset 0
>       struct rx_msdu_end_qcn9274   msdu_end;         // offset 8
>       u8                           rx_padding0[N];   // <- the bug
>       u64                          mpdu_start_tag;
>       struct rx_mpdu_start_qcn9274 mpdu_start;
>       struct rx_pkt_hdr_tlv        pkt_hdr_tlv;
>       u8                           msdu_payload[];
>   };
> 
> With that fixed, the firmware error is gone, and we can now receive
> an IP from DHCP.  We're working on getting the TX path to work next.

OK, good to see it gets fixed!

> 
> Thanks and Regards,
> Marcus
