Poll service ready completion by default to avoid warning `failed to receive service ready completion, polling..`?
Baochen Qiang
baochen.qiang at oss.qualcomm.com
Tue Jul 22 02:38:15 PDT 2025
On 7/22/2025 4:37 PM, Paul Menzel wrote:
> Dear Linux folks,
>
>
> Today, on the Intel Kaby Lake laptop Dell XPS 13 9360 with
>
> $ lspci -nn -s 3a:
> 3a:00.0 Network controller [0280]: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter [168c:003e] (rev 32)
>
> resuming from ACPI S3 took longer, as it sometimes does, and looking into this, I see
> `failed to receive service ready completion, polling..` after a delay of five seconds:
>
> ```
> [ 0.000000] Linux version 6.16.0-rc6-00253-g4871b7cb27f4 (build at bohemianrhapsody.molgen.mpg.de) (gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #90 SMP PREEMPT_DYNAMIC Sat Jul 19 08:53:39 CEST 2025
> […]
> [ 8.588020] abreu kernel: ath10k_pci 0000:3a:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1a56:1535
> [ 8.588372] abreu kernel: ath10k_pci 0000:3a:00.0: kconfig debug 0 debugfs 0 tracing 0 dfs 0 testmode 0
> [ 8.588603] abreu kernel: ath10k_pci 0000:3a:00.0: firmware ver WLAN.RM.4.4.1-00309- api 6 features wowlan,ignore-otp,mfp crc32 0793bcf2
> […]
> [ 9.113550] Bluetooth: hci0: QCA: patch rome 0x302 build 0x3e8, firmware rome 0x302 build 0x111
> […]
> [41804.953487] PM: suspend entry (deep)
> [41804.988361] Filesystems sync: 0.034 seconds
> [41805.007216] Freezing user space processes
> [41805.009650] Freezing user space processes completed (elapsed 0.002 seconds)
> [41805.009663] OOM killer disabled.
> [41805.009666] Freezing remaining freezable tasks
> [41805.011383] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [41805.011502] printk: Suspending console(s) (use no_console_suspend to debug)
> [41805.523883] ACPI: EC: interrupt blocked
> [41805.545779] ACPI: PM: Preparing to enter system sleep state S3
> [41805.556040] ACPI: EC: event blocked
> [41805.556045] ACPI: EC: EC stopped
> [41805.556046] ACPI: PM: Saving platform NVS memory
> [41805.559408] Disabling non-boot CPUs ...
> [41805.562480] smpboot: CPU 3 is now offline
> [41805.567105] smpboot: CPU 2 is now offline
> [41805.572122] smpboot: CPU 1 is now offline
> [41805.582034] ACPI: PM: Low-level resume complete
> [41805.582079] ACPI: EC: EC started
> [41805.582080] ACPI: PM: Restoring platform NVS memory
> [41805.583986] Enabling non-boot CPUs ...
> [41805.584009] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [41805.584734] CPU1 is up
> [41805.584749] smpboot: Booting Node 0 Processor 2 APIC 0x1
> [41805.585514] CPU2 is up
> [41805.585530] smpboot: Booting Node 0 Processor 3 APIC 0x3
> [41805.586216] CPU3 is up
> [41805.589070] ACPI: PM: Waking up from system sleep state S3
> [41805.623652] ACPI: EC: interrupt unblocked
> [41805.640074] ACPI: EC: event unblocked
> [41805.651951] nvme nvme0: 4/0/0 default/read/poll queues
> [41805.865391] atkbd serio0: Failed to deactivate keyboard on isa0060/serio0
> [41810.933639] ath10k_pci 0000:3a:00.0: failed to receive service ready completion, polling..
> [41810.933769] ath10k_pci 0000:3a:00.0: service ready completion received, continuing normally
> [41810.986330] OOM killer enabled.
> [41810.986332] Restarting tasks: Starting
> […]
> ```
>
> Commit e57b7d62a1b2 (wifi: ath10k: poll service ready message before failing) [1][2],
> present since Linux v6.10-rc1, added this to avoid the hardware not being initialized:
>
>     time_left = wait_for_completion_timeout(&ar->wmi.service_ready,
>                                             WMI_SERVICE_READY_TIMEOUT_HZ);
>     if (!time_left) {
>         /* Sometimes the PCI HIF doesn't receive interrupt
>          * for the service ready message even if the buffer
>          * was completed. PCIe sniffer shows that it's
>          * because the corresponding CE ring doesn't fires
>          * it. Workaround here by polling CE rings once.
>          */
>         ath10k_warn(ar, "failed to receive service ready completion, polling..\n");
>
>         for (i = 0; i < CE_COUNT; i++)
>             ath10k_hif_send_complete_check(ar, i, 1);
>
>         time_left = wait_for_completion_timeout(&ar->wmi.service_ready,
>                                                 WMI_SERVICE_READY_TIMEOUT_HZ);
>         if (!time_left) {
>             ath10k_warn(ar, "polling timed out\n");
>             return -ETIMEDOUT;
>         }
>
>         ath10k_warn(ar, "service ready completion received, continuing normally\n");
>     }
>
> The comment says, it’s a hardware issue. I guess from the Qualcomm device and not the
> board design, as it happens with several devices like James’?
>
> Anyway, should polling be used by default then to avoid the delay?
Adding an extra poll before the wait seems OK to me.
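
To make the ordering concrete, here is a rough userspace model, not driver code. All names in it (`model_dev`, `model_poll_rings`, `model_wait`, and so on) are made up for illustration, and it assumes the service ready message is already sitting in the ring when the poll runs:

```c
#include <stdbool.h>

/* Userspace model only: none of these names exist in ath10k. */
struct model_dev {
    bool msg_in_ring;    /* firmware has completed the service ready buffer */
    bool irq_delivered;  /* the CE interrupt for that buffer actually fired */
    bool service_ready;  /* stands in for the ar->wmi.service_ready completion */
};

/* Stands in for the ath10k_hif_send_complete_check() loop: reap completed
 * buffers without depending on the interrupt. */
static void model_poll_rings(struct model_dev *dev)
{
    if (dev->msg_in_ring)
        dev->service_ready = true;
}

/* Stands in for wait_for_completion_timeout(): returns the jiffies left,
 * or 0 on timeout. Time spent waiting is added to *elapsed. */
static unsigned long model_wait(struct model_dev *dev, unsigned long timeout,
                                unsigned long *elapsed)
{
    if (dev->msg_in_ring && dev->irq_delivered)
        dev->service_ready = true;
    if (dev->service_ready)
        return timeout;
    *elapsed += timeout;
    return 0;
}

/* Current order: wait, then poll, then wait again.
 * Returns the jiffies spent waiting. */
static unsigned long model_wait_then_poll(struct model_dev *dev,
                                          unsigned long timeout)
{
    unsigned long elapsed = 0;

    if (!model_wait(dev, timeout, &elapsed)) {
        model_poll_rings(dev);
        model_wait(dev, timeout, &elapsed);
    }
    return elapsed;
}

/* Proposed order: poll once first, then fall back to the existing sequence. */
static unsigned long model_poll_then_wait(struct model_dev *dev,
                                          unsigned long timeout)
{
    model_poll_rings(dev);
    return model_wait_then_poll(dev, timeout);
}
```

In this model, a device with the message in the ring but no interrupt spends one full timeout in `model_wait_then_poll()` and none in `model_poll_then_wait()`. If the message has not arrived yet when the first poll runs, the poll finds nothing and the existing wait-and-poll fallback is still needed, so the early poll would be an addition, not a replacement.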
>
> Additionally I have two questions regarding the code:
>
> 1. Is `WMI_SERVICE_READY_TIMEOUT_HZ` the right value to pass to
> `wait_for_completion_timeout(struct completion *done, unsigned long timeout)`?
>
> The macro is defined as:
>
> drivers/net/wireless/ath/ath10k/wmi.h:#define WMI_SERVICE_READY_TIMEOUT_HZ (5 * HZ)
>
> `timeout` is supposed to be in jiffies, and `CONFIG_HZ_250=y` on my system. I wonder how
> that amounts to five seconds on my system.
HZ is defined as jiffies per second, so 5 * HZ equals 5 seconds.
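
As a quick check of the arithmetic, here is a small userspace sketch. The two helper names are made up for illustration and are not kernel APIs; in the kernel, HZ comes from the CONFIG_HZ setting:

```c
/* Userspace model only: the kernel gets HZ from CONFIG_HZ and has its own
 * jiffies helpers; these two functions are made up for illustration. */

/* Mirrors the shape of WMI_SERVICE_READY_TIMEOUT_HZ, i.e. (5 * HZ). */
static unsigned long model_timeout_jiffies(unsigned long hz)
{
    return 5 * hz;
}

/* Converts a jiffies count to milliseconds for a given tick rate. */
static unsigned long model_jiffies_to_ms(unsigned long jiffies,
                                         unsigned long hz)
{
    return jiffies * 1000 / hz;
}
```

With hz = 250 (CONFIG_HZ_250=y) the timeout is 1250 jiffies, and with hz = 1000 it is 5000 jiffies. Both convert back to 5000 ms, so the wall-clock timeout does not depend on the configured tick rate.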
>
> The timeout should probably be defined in seconds? Does the WMI specification say
> something about this?
>
> 2. Is the task interruptible and should `wait_for_completion_interruptible_timeout(struct
> completion *done, unsigned long timeout)` be used?
I am not sure yet. May I ask what prompted the question?
>
>
> Kind regards,
>
> Paul
>
>
> [1]: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e57b7d62a1b2f496caf0beba81cec3c90fad80d5
> [2]: https://lore.kernel.org/all/20240227030409.89702-1-quic_bqiang@quicinc.com/
>