Recent driver changes destabilized QCA9377 connection quality

Mohammed Shafi Shajakhan mohammed at codeaurora.org
Sun Feb 26 20:24:05 PST 2017


Hi,

On Sun, Feb 26, 2017 at 04:40:42PM +0100, Tobias Predel wrote:
> Hello,
> 
> as I still encounter some stability issues with my beloved QCA9377 chipset on Linux, I decided to recompile the ath10k kernel module according to [1] in order to enable debugging and tracing (see [2]) instead of bisecting the commits, because I don't have the resources to do that in a reasonable time.

[shafi] It would be good to describe the stability issues in more detail.

It is also worth trying backports:

http://drvbp1.linux-foundation.org/~mcgrof/rel-html/backports/ (if you want to
try between stable releases)
https://wireless.wiki.kernel.org/en/users/drivers/ath10k/backports
http://buildbot.w1.fi/backports-wireless-testing/ (latest)

> 
> Boundary conditions (additional information is provided below):
> 
> - Kernel version: 4.9.11
> - Firmware files from https://github.com/kvalo/ath10k-firmware (the untested one just causes a firmware crash, so I guess that one is for another type)
> 
> My kernel ring buffer is flooded with two interesting types of messages, and that is why I wanted to ask whether these indicate "normal" behaviour or not.
> 
> (1) PCI-related messages, back and forth all the time
> 
> [  872.134559] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.134572] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1
> [  872.134585] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.134598] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1
> [  872.134614] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.134629] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1
> [  872.134645] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.134657] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1
> [  872.134670] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.195874] ath10k_pci 0000:02:00.0: pci ps timer refcount 0 awake 1
> [  872.195891] ath10k_pci 0000:02:00.0: pci ps sleep reg refcount 0 awake 1
> [  872.337625] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 0
> [  872.337644] ath10k_pci 0000:02:00.0: pci ps wake reg refcount 0 awake 0
> [  872.337707] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.337727] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1
> [  872.337739] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.337749] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1
> [  872.337758] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.337768] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1
> [  872.337781] ath10k_pci 0000:02:00.0: pci ps sleep refcount 1 awake 1
> [  872.337791] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1
> ...

[shafi] If you suspect PCI PS, you can try disabling it; see this patch:
ath10k: disable PCI PS for QCA988X and QCA99X0
https://patchwork.kernel.org/patch/7277361/

But if this helps, it's not a regression.

> 
> (2) Many FCS (frame check sequence) errors paired with "len 0"?
> 
> [  872.338194] ath10k_pci 0000:02:00.0: rx skb ffff88003765ec00 len 0 peer 00:XX:XX:XX:XX:88  mcast sn 1470 legacy rate_idx 0 vht_nss 0 freq 2447 band 0 flag 0x1200020 fcs-err 1 mic-err 0 amsdu-more 0
> [  872.952364] ath10k_pci 0000:02:00.0: rx skb ffff880011356400 len 0 peer 00:XX:XX:XX:XX:88  mcast sn 1476 legacy rate_idx 0 vht_nss 0 freq 2447 band 0 flag 0x1200020 fcs-err 1 mic-err 0 amsdu-more 0
> [  873.259564] ath10k_pci 0000:02:00.0: rx skb ffff88001ce2fd00 len 0 peer 00:XX:XX:XX:XX:88  mcast sn 1479 legacy rate_idx 0 vht_nss 0 freq 2447 band 0 flag 0x1200020 fcs-err 1 mic-err 0 amsdu-more 0
> [  874.181195] ath10k_pci 0000:02:00.0: rx skb ffff88004b26be00 len 0 peer 00:XX:XX:XX:XX:88  mcast sn 1488 legacy rate_idx 0 vht_nss 0 freq 2447 band 0 flag 0x1200020 fcs-err 1 mic-err 0 amsdu-more 0
> [  874.488037] ath10k_pci 0000:02:00.0: rx skb ffff88004b26bb00 len 0 peer 00:XX:XX:XX:XX:88  mcast sn 1491 legacy rate_idx 0 vht_nss 0 freq 2447 band 0 flag 0x1200020 fcs-err 1 mic-err 0 amsdu-more 0
> [  874.795505] ath10k_pci 0000:02:00.0: rx skb ffff880037856e00 len 0 peer 00:XX:XX:XX:XX:88  mcast sn 1496 legacy rate_idx 0 vht_nss 0 freq 2447 band 0 flag 0x1200020 fcs-err 1 mic-err 0 amsdu-more 0
> ...
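
[note] To judge how widespread the "fcs-err 1" / "len 0" pattern is, the
rx lines can be tallied instead of read by eye. The following is only a
sketch: the regular expression is written against the log format quoted
above, and the names RX_LINE and summarize are made up for illustration.

```python
import re
from collections import Counter

# Pattern written against the "rx skb" debug lines quoted above (an assumption:
# other kernel versions may print these fields in a different order).
RX_LINE = re.compile(
    r"rx skb \S+ len (?P<len>\d+) peer (?P<peer>\S+)"
    r".*?fcs-err (?P<fcs>\d+) mic-err (?P<mic>\d+)"
)


def summarize(lines):
    """Count rx frames, FCS errors, MIC errors and zero-length frames."""
    counts = Counter()
    for line in lines:
        match = RX_LINE.search(line)
        if match is None:
            continue  # not an rx line, e.g. a "pci ps" message
        counts["rx"] += 1
        if match["fcs"] != "0":
            counts["fcs_err"] += 1
        if match["mic"] != "0":
            counts["mic_err"] += 1
        if match["len"] == "0":
            counts["len0"] += 1
    return counts


# Three lines copied from the log excerpts in this thread.
sample = [
    "[  872.337791] ath10k_pci 0000:02:00.0: pci ps wake refcount 0 awake 1",
    "[  872.338194] ath10k_pci 0000:02:00.0: rx skb ffff88003765ec00 len 0 peer 00:XX:XX:XX:XX:88  mcast sn 1470 legacy rate_idx 0 vht_nss 0 freq 2447 band 0 flag 0x1200020 fcs-err 1 mic-err 0 amsdu-more 0",
    "[  872.952364] ath10k_pci 0000:02:00.0: rx skb ffff880011356400 len 0 peer 00:XX:XX:XX:XX:88  mcast sn 1476 legacy rate_idx 0 vht_nss 0 freq 2447 band 0 flag 0x1200020 fcs-err 1 mic-err 0 amsdu-more 0",
]

if __name__ == "__main__":
    print(dict(summarize(sample)))
```

Feeding it a saved copy of the kernel log (for example the output of
dmesg written to a file) gives the share of rx lines that carry an FCS
error, which is easier to compare between kernel versions than raw log
excerpts.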

[shafi] These logs will flood your console. Make sure the flooding itself
does not cause more issues. I would recommend enabling fewer debug logs
at first and then increasing them progressively.
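
[note] Enabling debug categories step by step comes down to building the
debug_mask value bit by bit. The following is only a sketch: the bit
values are assumed from ath10k's debug.h and should be checked against
your own kernel tree, and the helper names are made up for illustration.
It also assumes that the "pci ps" lines are controlled by the PCI_PS bit
and the "rx skb" lines by the DATA bit.

```python
# Assumed subset of the ath10k debug categories (verify in
# drivers/net/wireless/ath/ath10k/debug.h of your kernel tree).
ATH10K_DBG = {
    "PCI": 0x00000001,
    "WMI": 0x00000002,
    "HTT": 0x00000008,
    "MAC": 0x00000010,
    "BOOT": 0x00000020,
    "DATA": 0x00000200,
    "PCI_PS": 0x00004000,
}


def build_mask(*names):
    """Return a mask with only the named categories enabled."""
    mask = 0
    for name in names:
        mask |= ATH10K_DBG[name]
    return mask


def drop(mask, *names):
    """Return the mask with the named categories cleared."""
    for name in names:
        mask &= ~ATH10K_DBG[name]
    return mask & 0xFFFFFFFF


if __name__ == "__main__":
    # Start small, then widen progressively.
    step1 = build_mask("BOOT", "MAC", "WMI")
    step2 = step1 | build_mask("HTT")
    # Alternatively, keep the broad mask from [2] but silence the two
    # categories that appear to produce most of the flood.
    quiet = drop(0xFFFFFF3F, "PCI_PS", "DATA")
    for label, value in (("step1", step1), ("step2", step2), ("quiet", quiet)):
        print(label, hex(value))
```

The resulting value can then be written to the debug_mask parameter of
ath10k_core listed in the modinfo output (on most systems exposed as
/sys/module/ath10k_core/parameters/debug_mask).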

> 
> (3) A minority of probably (?) ordinary messages like
> [  873.566518] ath10k_pci 0000:02:00.0: pci rx ce pipe 5 len 60
> [  873.566529] ath10k_pci 0000:02:00.0: htt rx, msg_type: 0x1
> ...
> 
> I would really appreciate it if someone could just confirm whether the debug messages hint at firmware/driver issues or not. Thanks for your help!
> 
> Some additional information:
> 
> - iwconfig wlp2s0 power on/off doesn't change much, the same messages from above keep appearing
> 
> - PCI related (lspci -vv):
> 02:00.0 Network controller: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter (rev 30)
> 	Subsystem: AzureWave Device 2231
> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0, Cache Line Size: 64 bytes
> 	Interrupt: pin A routed to IRQ 312
> 	Region 0: Memory at 81000000 (64-bit, non-prefetchable) [size=2M]
> 	Capabilities: [40] Power Management version 3
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> 		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> 	Capabilities: [50] MSI: Enable+ Count=1/8 Maskable+ 64bit-
> 		Address: fee0f00c  Data: 4143
> 		Masking: 000000fe  Pending: 00000000
> 	Capabilities: [70] Express (v2) Endpoint, MSI 00
> 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
> 			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
> 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
> 			MaxPayload 128 bytes, MaxReadReq 512 bytes
> 		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <4us, L1 <64us
> 			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
> 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> 			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
> 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> 		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR+, OBFF Via message
> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
> 		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> 		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
> 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> 	Capabilities: [100 v2] Advanced Error Reporting
> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> 		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
> 	Capabilities: [148 v1] Virtual Channel
> 		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
> 		Arb:	Fixed- WRR32- WRR64- WRR128-
> 		Ctrl:	ArbSelect=Fixed
> 		Status:	InProgress-
> 		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> 			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> 			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
> 			Status:	NegoPending- InProgress-
> 	Capabilities: [168 v1] Device Serial Number 00-00-00-00-00-00-00-00
> 	Capabilities: [178 v1] Latency Tolerance Reporting
> 		Max snoop latency: 15360ns
> 		Max no snoop latency: 15360ns
> 	Capabilities: [180 v1] L1 PM Substates
> 		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> 			  PortCommonModeRestoreTime=50us PortTPowerOnTime=10us
> 		L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> 			   T_CommonMode=0us LTR1.2_Threshold=163840ns
> 		L1SubCtl2: T_PwrOn=10us
> 	Kernel driver in use: ath10k_pci
> 	Kernel modules: ath10k_pci
> 
> - Some ethtool statistics (ethtool -S <device>):
> NIC statistics:
>      rx_packets: 92152
>      rx_bytes: 55156501
>      rx_duplicates: 74
>      rx_fragments: 65368
>      rx_dropped: 421
>      tx_packets: 19775
>      tx_bytes: 2283053
>      tx_filtered: 0
>      tx_retry_failed: 31
>      tx_retries: 0
>      sta_state: 4
>      txrate: 1000000
>      rxrate: 54000000
>      signal: 185
>      channel: 2447
>      noise: 18446744073709551615
>      ch_time: 18446744073709551615
>      ch_time_busy: 18446744073709551615
>      ch_time_ext_busy: 18446744073709551615
>      ch_time_rx: 18446744073709551615
>      ch_time_tx: 18446744073709551615
>      tx_pkts_nic: 36194
>      tx_bytes_nic: 0
>      rx_pkts_nic: 52893
>      rx_bytes_nic: 0
>      d_noise_floor: 18446744073709551503
>      d_cycle_count: 3234369886
>      d_phy_error: 8
>      d_rts_bad: 0
>      d_rts_good: 0
>      d_tx_power: 38
>      d_rx_crc_err: 0
>      d_no_beacon: 0
>      d_tx_mpdus_queued: 20978
>      d_tx_msdu_queued: 20978
>      d_tx_msdu_dropped: 0
>      d_local_enqued: 1205
>      d_local_freed: 1205
>      d_tx_ppdu_hw_queued: 36194
>      d_tx_ppdu_reaped: 36194
>      d_tx_fifo_underrun: 0
>      d_tx_ppdu_abort: 0
>      d_tx_mpdu_requed: 15216
>      d_tx_excessive_retries: 15247
>      d_tx_hw_rate: 0
>      d_tx_dropped_sw_retries: 0
>      d_tx_illegal_rate: 0
>      d_tx_continuous_xretries: 0
>      d_tx_timeout: 0
>      d_tx_mpdu_txop_limit: 0
>      d_pdev_resets: 3
>      d_rx_mid_ppdu_route_change: 0
>      d_rx_status: 83552
>      d_rx_extra_frags_ring0: 0
>      d_rx_extra_frags_ring1: 43
>      d_rx_extra_frags_ring2: 6
>      d_rx_extra_frags_ring3: 0
>      d_rx_msdu_htt: 52894
>      d_rx_mpdu_htt: 52893
>      d_rx_msdu_stack: 26796
>      d_rx_mpdu_stack: 26796
>      d_rx_phy_err: 0
>      d_rx_phy_err_drops: 2
>      d_rx_mpdu_errors: 18097
>      d_fw_crash_count: 0
>      d_fw_warm_reset_count: 3
>      d_fw_cold_reset_count: 3
> 
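
[note] Several of the huge values in the ethtool output above look like
negative numbers printed as unsigned 64-bit integers. The following is
only a sketch of how to read them: the helper name as_signed is made up
for illustration, and treating "signal" as an 8-bit value and the
all-ones values as a -1 "not available" marker are assumptions, not
something the output itself confirms.

```python
def as_signed(value, bits=64):
    """Reinterpret an unsigned integer as a two's-complement signed one."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):  # sign bit set
        return value - (1 << bits)
    return value


# Values copied from the ethtool -S output above.
stats = {
    "noise": 18446744073709551615,
    "ch_time": 18446744073709551615,
    "d_noise_floor": 18446744073709551503,
}

if __name__ == "__main__":
    for name, raw in stats.items():
        print(name, as_signed(raw))
    # Assumption: "signal" is a signed dBm value truncated to 8 bits.
    print("signal", as_signed(185, bits=8))
```

Read this way, d_noise_floor and signal become plausible dBm figures,
while the all-ones counters most likely mean that the driver did not
report a value.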
> - firmware (ethtool -i <device>)
> driver: ath10k_pci
> version: 4.9.11-1-ARCH
> firmware-version: WLAN.TF.1.0-00267-1
> expansion-rom-version: 
> bus-info: 0000:02:00.0
> supports-statistics: yes
> supports-test: no
> supports-eeprom-access: no
> supports-register-dump: no
> supports-priv-flags: no
> 
> - modinfo ath10k_core
> filename:       /lib/modules/4.9.11-1-ARCH/kernel/drivers/net/wireless/ath/ath10k/ath10k_core.ko.gz
> license:        Dual BSD/GPL
> description:    Core module for Qualcomm Atheros 802.11ac wireless LAN cards.
> author:         Qualcomm Atheros
> depends:        mac80211,cfg80211,ath
> vermagic:       4.9.11-ARCH SMP preempt mod_unload modversions 
> parm:           debug_mask:Debugging mask (uint)
> parm:           uart_print:Uart target debugging (bool)
> parm:           skip_otp:Skip otp failure for calibration in testmode (bool)
> parm:           cryptmode:Crypto mode: 0-hardware, 1-software (uint)
> parm:           rawmode:Use raw 802.11 frame datapath (bool)
> 
> - modinfo ath10k_pci
> filename:       /lib/modules/4.9.11-1-ARCH/kernel/drivers/net/wireless/ath/ath10k/ath10k_pci.ko.gz
> firmware:       ath10k/QCA9377/hw1.0/board.bin
> firmware:       ath10k/QCA9377/hw1.0/firmware-5.bin
> firmware:       ath10k/QCA6174/hw3.0/board-2.bin
> firmware:       ath10k/QCA6174/hw3.0/board.bin
> firmware:       ath10k/QCA6174/hw3.0/firmware-5.bin
> firmware:       ath10k/QCA6174/hw3.0/firmware-4.bin
> firmware:       ath10k/QCA6174/hw2.1/board-2.bin
> firmware:       ath10k/QCA6174/hw2.1/board.bin
> firmware:       ath10k/QCA6174/hw2.1/firmware-5.bin
> firmware:       ath10k/QCA6174/hw2.1/firmware-4.bin
> firmware:       ath10k/QCA9887/hw1.0/board-2.bin
> firmware:       ath10k/QCA9887/hw1.0/board.bin
> firmware:       ath10k/QCA9887/hw1.0/firmware-5.bin
> firmware:       ath10k/QCA988X/hw2.0/board-2.bin
> firmware:       ath10k/QCA988X/hw2.0/board.bin
> firmware:       ath10k/QCA988X/hw2.0/firmware-5.bin
> firmware:       ath10k/QCA988X/hw2.0/firmware-4.bin
> firmware:       ath10k/QCA988X/hw2.0/firmware-3.bin
> firmware:       ath10k/QCA988X/hw2.0/firmware-2.bin
> license:        Dual BSD/GPL
> description:    Driver support for Qualcomm Atheros 802.11ac WLAN PCIe/AHB devices
> author:         Qualcomm Atheros
> alias:          pci:v0000168Cd00000050sv*sd*bc*sc*i*
> alias:          pci:v0000168Cd00000042sv*sd*bc*sc*i*
> alias:          pci:v0000168Cd00000046sv*sd*bc*sc*i*
> alias:          pci:v0000168Cd00000056sv*sd*bc*sc*i*
> alias:          pci:v0000168Cd00000040sv*sd*bc*sc*i*
> alias:          pci:v0000168Cd0000003Esv*sd*bc*sc*i*
> alias:          pci:v0000168Cd00000041sv*sd*bc*sc*i*
> alias:          pci:v0000168Cd0000003Csv*sd*bc*sc*i*
> depends:        ath10k_core
> vermagic:       4.9.11-ARCH SMP preempt mod_unload modversions 
> parm:           irq_mode:0: auto, 1: legacy, 2: msi (default: 0) (uint)
> parm:           reset_mode:0: auto, 1: warm only (default: 0) (uint)
> 
> Sincerely yours,
> 
> Tobias
> 
> [1] https://wiki.archlinux.org/index.php/Compile_kernel_module
> [2] https://wireless.wiki.kernel.org/en/users/drivers/ath10k/debug (I set debug_mask to 0xffffff3f via the sysfs interface)
> 
> -- 
> Tobias Predel
> currently studying Transportation Engineering B.Sc.
> at Stuttgart University
> 
> 
> _______________________________________________
> ath10k mailing list
> ath10k at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath10k


