[PATCH v2 2/2] wifi: ath12k: Fix firmware stats leak when pdev list is empty
Saikiran B
bjsaikiran at gmail.com
Thu Jan 29 06:06:17 PST 2026
On Thu, Jan 29, 2026 at 7:57 AM Baochen Qiang
<baochen.qiang at oss.qualcomm.com> wrote:
>
>
>
> On 1/27/2026 12:40 PM, Saikiran B wrote:
> > I have analyzed the logs and code flow in depth to provide more
> > definitive answers for your questions.
> >
> > The log entries showing the failure are:
> > [ 563.574076] ath12k_pci 0004:01:00.0: failed to pull fw stats: -71
> > [ 564.575896] ath12k_pci 0004:01:00.0: time out while waiting for get fw stats
> >
> > 1. Why are other stats populated?
> > The "failed to pull fw stats: -71" error is not the initial failure
> > but a symptom that appears after repeated operations. The leak happens
> > during *successful* calls prior to this error.
> >
> > Code flow proving the leak:
> > - ath12k_mac_get_fw_stats() sends WMI_REQUEST_PDEV_STAT.
> > - Firmware responds. ath12k_update_stats_event() parses the response.
> > - ath12k_wmi_fw_stats_process() is called, which splices 'vdevs' and
> > 'beacon' stats into ar->fw_stats.vdevs/bcn.
> > - ath12k_mac_get_fw_stats() returns 0 (Success).
> > - In ath12k_mac_op_get_txpower(), the check `if (!pdev)` triggers when
> > the pdev-specific list is empty (but the vdev list is NOT empty).
> > - The function exits via `err_fallback` WITHOUT calling ath12k_fw_stats_reset().
> > - Result: The 'vdev' and 'beacon' stats that were spliced into
> > ar->fw_stats remain there, leaking memory and accumulating with every
> > call.
> >
> > 2. Exact place where -71 is printed:
> > The error "failed to pull fw stats: -71" is printed in
> > ath12k_update_stats_event() (drivers/net/wireless/ath/ath12k/wmi.c).
> > It corresponds to "ret = ath12k_wmi_pull_fw_stats()" returning -EPROTO.
> > This propagates from ath12k_wmi_tlv_fw_stats_data_parse()
> > (drivers/net/wireless/ath/ath12k/wmi.c),
> > when buffer validation checks (like `len < sizeof(*src)`) fail.
> >
> > Conclusion:
> > The fix in my patch (resetting stats when `!pdev`) is critical because
> > it ensures that the accumulated 'vdev' and 'beacon' stats are freed
> > even when the 'pdev' list ends up empty.
> >
> > Let me know if you need anything else.
>
> can you please try below to see if it can fix your issue?
>
> https://lore.kernel.org/r/20260129-ath12k-fw-stats-fixes-v1-0-55d66064f4d5@oss.qualcomm.com
>
> >
> > Thanks & Regards,
> > Saikiran
> >
> > On Tue, Jan 27, 2026 at 9:47 AM Saikiran B <bjsaikiran at gmail.com> wrote:
> >>
> >> Hi Baochen,
> >>
> >> Regarding your questions:
> >>
> >> "Are other stats populated?"
> >>
> >> Yes. When ath12k_mac_get_fw_stats() returns success (0), it means the
> >> firmware response was received and valid WMI events were processed.
> >> The firmware response to WMI_REQUEST_PDEV_STAT typically includes
> >> multiple stats TLVs (vdev stats, beacon stats, etc.). Even if the
> >> "pdev stats" list ends up empty (e.g., due to specific filtering or
> >> availability), the firmware should have populated other lists (like
> >> vdevs or beacons) in the ar->fw_stats structure. If we don't reset,
> >> these valid entries leak and accumulate.
> >>
> >> "Where exactly is -71 (EPROTO) printed?"
> >>
> >> The log "failed to pull fw stats: -71" is printed in
> >> ath12k_update_stats_event() (wmi.c line 8500 in my tree). This error
> >> code (-EPROTO) propagates from ath12k_wmi_tlv_fw_stats_data_parse(),
> >> where it is returned when buffer validation checks fail (e.g., if (len
> >> < sizeof(*src))). This failure suggests that the accumulated state or
> >> memory corruption from the leak eventually causes the parser to fail
> >> on subsequent events.
> >>
> >> So, fixing the leak is necessary for correctness regardless of the
> >> specific side-effect error code.
> >>
> >> Thanks & Regards,
> >> Saikiran
> >>
> >> On Tue, Jan 27, 2026 at 8:57 AM Baochen Qiang
> >> <baochen.qiang at oss.qualcomm.com> wrote:
> >>>
> >>>
> >>>
> >>> On 1/26/2026 5:52 PM, Saikiran wrote:
> >>>> The commits bd6ec8111e65 and 2977567b244f changed firmware stats handling
> >>>> to be caller-driven, requiring explicit ath12k_fw_stats_reset() calls
> >>>> after using ath12k_mac_get_fw_stats().
> >>>>
> >>>> In ath12k_mac_op_get_txpower(), when ath12k_mac_get_fw_stats() succeeds
> >>>> but the pdev stats list is empty, the function exits without calling
> >>>> ath12k_fw_stats_reset(). Even though the pdev list is empty, the firmware
> >>>> may have populated other stats lists (vdevs, beacons, etc.) in the
> >>>
> >>> 'may' is not enough, we need to be 100% sure whether other stats are populated. This is
> >>> critical for us to find the root cause.
> >>>
> >>>> ar->fw_stats structure.
> >>>>
> >>>> Without resetting the stats buffer, this data accumulates across multiple
> >>>> calls, eventually causing the stats buffer to overflow and leading to
> >>>> firmware communication failures (error -71/EPROTO) during subsequent
> >>>> operations.
> >>>>
> >>>> The issue manifests during 5GHz scanning which triggers multiple TX power
> >>>> queries. Symptoms include:
> >>>> - "failed to pull fw stats: -71" errors in dmesg
> >>>
> >>> still, can you please check the logs to see at which exact place is this printed?
> >>>
> >>>> - 5GHz networks not detected despite hardware support
> >>>> - 2.4GHz networks work normally
> >>>>
> >>>> Fix by calling ath12k_fw_stats_reset() when the pdev list is empty,
> >>>> ensuring the stats buffer is properly cleaned up even when only partial
> >>>> stats data is received from firmware.
> >>>>
> >>>> Fixes: bd6ec8111e65 ("wifi: ath12k: Make firmware stats reset caller-driven")
> >>>> Link: https://bugs.launchpad.net/ubuntu-concept/+bug/2138308
> >>>> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302 (Lenovo Yoga Slim 7x)
> >>>> Signed-off-by: Saikiran <bjsaikiran at gmail.com>
> >>>> ---
> >>>> drivers/net/wireless/ath/ath12k/mac.c | 1 +
> >>>> 1 file changed, 1 insertion(+)
> >>>>
> >>>> diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c
> >>>> index e0e49f782bf8..6e35c3ee9864 100644
> >>>> --- a/drivers/net/wireless/ath/ath12k/mac.c
> >>>> +++ b/drivers/net/wireless/ath/ath12k/mac.c
> >>>> @@ -5169,6 +5169,7 @@ static int ath12k_mac_op_get_txpower(struct ieee80211_hw *hw,
> >>>> struct ath12k_fw_stats_pdev, list);
> >>>> if (!pdev) {
> >>>> spin_unlock_bh(&ar->data_lock);
> >>>> + ath12k_fw_stats_reset(ar);
> >>>> goto err_fallback;
> >>>> }
> >>>>
> >>>
>
Hi Baochen,

I tried applying your patches on top of v6.19-rc7 (the latest mainline
release candidate I'm testing on), but I ran into build issues because
some of the dependencies seem to be missing.

Specifically, Patch 2 ("wifi: ath12k: fix station lookup failure when
disconnecting from AP") uses `ath12k_link_sta_find_by_addr()`, which does
not exist in my tree. It seems your patches are based on a different tree
(ath-next?) that has newer changes not yet in mainline.

Could you please point me to the specific git repo/branch you are
using? I can then build and test on that baseline to be sure.

Regarding the firmware stats issue:

I verified that the firmware files match the latest available (MD5 sums
matched), yet the "-71" errors and the memory leak persist on my device
without fixes.

I successfully applied the logic from your Patch 1 manually (since
`ath12k_mac_get_target_pdev_id()` exists in my tree), but I haven't fully
validated whether it alone resolves the leak in all scenarios.

However, the fix I proposed in my v2 patch (resetting stats when pdev
list is empty) definitely stops the leak mechanism by ensuring cleanup
happens even when the firmware returns partial stats (which seems to
be the trigger condition).
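To illustrate the mechanism described above, here is a minimal userspace model in C. It is only a sketch, not driver code: the types and the `model_*` / `list_*` helpers are hypothetical stand-ins for `ar->fw_stats`, ath12k_mac_get_fw_stats(), ath12k_fw_stats_reset() and the `!pdev` path of ath12k_mac_op_get_txpower(), and it assumes each firmware response adds exactly one vdev entry and one beacon entry and no pdev entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one stats entry (pdev, vdev or beacon). */
struct stats_node {
	struct stats_node *next;
};

/* Simplified model of ar->fw_stats: one list per stats type. */
struct fw_stats_model {
	struct stats_node *pdevs;
	struct stats_node *vdevs;
	struct stats_node *bcn;
};

static void list_push(struct stats_node **head)
{
	struct stats_node *n = malloc(sizeof(*n));

	assert(n);
	n->next = *head;
	*head = n;
}

static size_t list_len(const struct stats_node *head)
{
	size_t len = 0;

	for (; head; head = head->next)
		len++;
	return len;
}

static void list_free(struct stats_node **head)
{
	while (*head) {
		struct stats_node *n = *head;

		*head = n->next;
		free(n);
	}
}

/* Models the reset helper: frees every list. */
static void model_stats_reset(struct fw_stats_model *s)
{
	list_free(&s->pdevs);
	list_free(&s->vdevs);
	list_free(&s->bcn);
}

/* Models a successful request whose response has no pdev entry (assumption). */
static void model_get_fw_stats_no_pdev(struct fw_stats_model *s)
{
	list_push(&s->vdevs);
	list_push(&s->bcn);
}

/* Models the txpower query; returns -1 for the fallback path. */
static int model_get_txpower(struct fw_stats_model *s, bool reset_on_empty)
{
	model_get_fw_stats_no_pdev(s);

	if (!s->pdevs) {
		if (reset_on_empty)	/* the one-line change from the v2 patch */
			model_stats_reset(s);
		return -1;
	}

	model_stats_reset(s);
	return 0;
}

/* Runs 'calls' queries and reports how many entries were left behind. */
static size_t model_leftover_after(size_t calls, bool reset_on_empty)
{
	struct fw_stats_model s = {0};
	size_t i, left;

	for (i = 0; i < calls; i++)
		(void)model_get_txpower(&s, reset_on_empty);

	left = list_len(s.vdevs) + list_len(s.bcn);
	model_stats_reset(&s);	/* clean up the model itself */
	return left;
}
```

In this model, calling `model_leftover_after(n, false)` leaves two entries behind per query, while `model_leftover_after(n, true)` leaves none. Whether the real firmware response always carries vdev and beacon entries in this path is exactly the open question in this thread, so the model only shows what happens if it does.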

I'll wait for your pointer to the base tree to do a proper test of your series.

Thanks & Regards,
Saikiran