PCI trouble on mvebu (Turris Omnia)

Toke Høiland-Jørgensen toke at redhat.com
Wed Mar 31 15:02:42 BST 2021


Pali Rohár <pali at kernel.org> writes:

> On Friday 26 March 2021 18:51:42 Toke Høiland-Jørgensen wrote:
>> Pali Rohár <pali at kernel.org> writes:
>> > On Friday 26 March 2021 17:54:38 Toke Høiland-Jørgensen wrote:
>> >> So we have these cases:
>> >> 
>> >> ASPM disabled:          ath9k, ath10k and mt76 cards all work
>> >> ASPM enabled, no patch: only mt76 card works
>> >> ASPM enabled + patch:   ath10k and mt76 cards work
>> >> 
>> >> So IDK, maybe the ath9k card needs a quirk as well? Or the mvebu board
>> >> is just generally flaky?
>> >
>> > I'm not sure. Maybe ASPM is somehow buggy on ath9k or needs some special
>> > handling. But the issue is not in PCI config space, as the ath9k driver
>> > starts initialization of this card. It also needs some debugging in the
>> > ath9k driver, since it prints that strange "mac chip rev" error.
>> 
>> Well that's just being output because it gets a revision that it doesn't
>> recognise - which it seems to be just reading from a register:
>> 
>> https://elixir.bootlin.com/linux/latest/source/drivers/net/wireless/ath/ath9k/hw.c#L255
>> 
>> The value it prints is consistent with the register read just returning
>> 0xffffffff, which from looking at ioread32() is the value returned on a
>> failed read. So there's a driver bug there - the check against -EIO
>> here is obviously nonsensical:
>> 
>> https://elixir.bootlin.com/linux/latest/source/drivers/net/wireless/ath/ath9k/hw.c#L290
>> 
>> But the underlying cause appears to be that the read from the register
>> fails, which I suppose is related to something the PCI bus does?
>> 
>> > I think this issue should be handled separately. Could you report it
>> > also to ath9k mailing list (and CC me)? Maybe other ath developers would
>> > know some more details.
>> 
>> I'll send a patch for the nonsensical check above, but other than that I
>> think we're still in PCI land here, or?
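
To illustrate why that check can never catch the failed read, here is a
minimal userspace sketch. It assumes the driver compares the u32 register
value directly against -EIO; both helper names are made up for
illustration and are not taken from ath9k.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* All-ones pattern that a failed 32-bit read is described as returning. */
#define FAILED_READ_VALUE UINT32_C(0xffffffff)

/* Hypothetical helper: same shape as a "u32 value == -EIO" comparison. */
bool matches_eio(uint32_t val)
{
	/* -EIO is converted to unsigned first: (uint32_t)-5 == 0xfffffffb. */
	return val == (uint32_t)-EIO;
}

/* Hypothetical helper: tests for the all-ones pattern instead. */
bool looks_like_failed_read(uint32_t val)
{
	return val == FAILED_READ_VALUE;
}

/* Small demonstration of the difference between the two checks. */
void demo_failed_read_check(void)
{
	uint32_t val = FAILED_READ_VALUE;	/* what a failed read hands back */

	assert(!matches_eio(val));		/* the -EIO comparison misses it */
	assert(looks_like_failed_read(val));	/* the all-ones test catches it */
}
```

So with EIO being 5, the comparison only matches 0xfffffffb, and an
all-ones value falls straight through to the revision parsing.
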
>
> First, can you try to enable my quirk also for this ath9k card with ASPM
> enabled?

Yup, with this I get both devices working:

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 8ff690c7679d..7e2f9c69f6b2 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3583,6 +3583,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, quirk_no_bus_reset);
  * PCIe bridge has forced link speed to 2.5 GT/s via PCI_EXP_LNKCTL2 register.
  */
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset_and_no_retrain_link);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x002e, quirk_no_bus_reset_and_no_retrain_link);
 
 /*
  * Root port on some Cavium CN8xxx chips do not successfully complete a bus

>
> I have there another ath9k card which after toggling link retraining
> changes PCI device ID (really!) to 0xABCD. But lspci ...
>
> There is a long story about broken ath9k cards that report the 0xABCD
> id on x86 machines with specific BIOS versions. It can be found in the
> ath9k-devel mailing list archive:
>
> https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07529.html
>
> Maybe we have now found the root cause of this ABCD? If yes, then it also
> explains why the above ath9k driver check fails (the device id was
> changed) and why the kernel sees the correct id (the kernel reads the id
> before configuring ASPM and therefore before triggering link retraining).
>
>> >> > Can you send PCI device id of your ath9k card (lspci -nn)? Because all
>> >> > my tested ath9k cards have different PCI device id.
>> >> 
>> >> [root at omnia-arch ~]# lspci -nn
>> >> 00:01.0 PCI bridge [0604]: Marvell Technology Group Ltd. Device [11ab:6820] (rev 04)
>> >> 00:02.0 PCI bridge [0604]: Marvell Technology Group Ltd. Device [11ab:6820] (rev 04)
>> >> 00:03.0 PCI bridge [0604]: Marvell Technology Group Ltd. Device [11ab:6820] (rev 04)
>> >> 01:00.0 Network controller [0280]: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) [168c:002e] (rev 01)
>> >> 02:00.0 Network controller [0280]: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter [168c:003c]
>> >
>> > That is fine. Also all ath9k testing cards have id 0x002e.
>
> Today I found out that lspci -nn may lie! Please send the output of the
> command: lspci -nn -x because the real PCI device id can be read only
> from the -x hexdump output.

Without the quirk applied to the ath9k card:

01:00.0 Network controller [0280]: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) [168c:002e] (rev 01)
00: 8c 16 2e 00 02 00 10 00 01 00 80 02 10 00 00 00
10: 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 8c 16 a4 30
30: 00 00 00 00 40 00 00 00 00 00 00 00 3d 01 00 00

02:00.0 Network controller [0280]: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter [168c:003c]
00: 8c 16 3c 00 46 05 10 00 00 00 80 02 10 00 00 00
10: 04 00 20 e0 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 20 ea 40 00 00 00 00 00 00 00 3e 01 00 00

And with:

01:00.0 Network controller [0280]: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) [168c:002e] (rev 01)
00: 8c 16 2e 00 46 01 10 00 01 00 80 02 10 00 00 00
10: 04 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 8c 16 a4 30
30: 00 00 00 00 40 00 00 00 00 00 00 00 3d 01 00 00

02:00.0 Network controller [0280]: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter [168c:003c]
00: 8c 16 3c 00 46 05 10 00 00 00 80 02 10 00 00 00
10: 04 00 20 e0 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 20 ea 40 00 00 00 00 00 00 00 3e 01 00 00



Is that change in bytes 5 and 6 significant?
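
For reference, bytes 5 and 6 of the dump are offsets 0x04-0x05 of the
standard PCI config header, i.e. the Command register; BAR0 at offset
0x10 differs between the two dumps as well. Below is a rough sketch that
decodes those fields from the first 0x14 bytes of the two 01:00.0 dumps
above. The helper and array names are made up for illustration; the bit
values are the standard Command register bits.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets in the standard PCI configuration header. */
enum { CFG_VENDOR_ID = 0x00, CFG_DEVICE_ID = 0x02, CFG_COMMAND = 0x04, CFG_BAR0 = 0x10 };

/* Command register bits. */
enum { CMD_MEMORY = 0x0002, CMD_MASTER = 0x0004, CMD_PARITY = 0x0040, CMD_SERR = 0x0100 };

/* First 0x14 bytes of the 01:00.0 dumps, copied from the lspci -x output. */
const uint8_t cfg_without_quirk[] = {
	0x8c, 0x16, 0x2e, 0x00, 0x02, 0x00, 0x10, 0x00,
	0x01, 0x00, 0x80, 0x02, 0x10, 0x00, 0x00, 0x00,
	0x04, 0x00, 0x00, 0x00,
};
const uint8_t cfg_with_quirk[] = {
	0x8c, 0x16, 0x2e, 0x00, 0x46, 0x01, 0x10, 0x00,
	0x01, 0x00, 0x80, 0x02, 0x10, 0x00, 0x00, 0x00,
	0x04, 0x00, 0x00, 0xe0,
};

/* Config space is little-endian: the low byte comes first in the dump. */
uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

uint32_t cfg_read32(const uint8_t *cfg, unsigned int off)
{
	return (uint32_t)cfg_read16(cfg, off) |
	       ((uint32_t)cfg_read16(cfg, off + 2) << 16);
}

void check_dumps(void)
{
	/* Both dumps report 168c:002e, not 0xabcd. */
	assert(cfg_read16(cfg_without_quirk, CFG_VENDOR_ID) == 0x168c);
	assert(cfg_read16(cfg_without_quirk, CFG_DEVICE_ID) == 0x002e);
	assert(cfg_read16(cfg_with_quirk, CFG_DEVICE_ID) == 0x002e);

	/* Without the quirk: only memory space decoding is enabled. */
	assert(cfg_read16(cfg_without_quirk, CFG_COMMAND) == CMD_MEMORY);
	/* With the quirk: memory, bus master, parity and SERR# are all set. */
	assert(cfg_read16(cfg_with_quirk, CFG_COMMAND) ==
	       (CMD_MEMORY | CMD_MASTER | CMD_PARITY | CMD_SERR));

	/* BAR0 holds no address without the quirk, 0xe0000000 with it. */
	assert((cfg_read32(cfg_without_quirk, CFG_BAR0) & ~UINT32_C(0xf)) == 0);
	assert((cfg_read32(cfg_with_quirk, CFG_BAR0) & ~UINT32_C(0xf)) ==
	       UINT32_C(0xe0000000));
}
```

So the dump without the quirk shows bus mastering off and no address in
BAR0, while the dump with the quirk shows the device with bus mastering
on and BAR0 at 0xe0000000. That is only a reading of the bits, not a
diagnosis of why they differ.
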

-Toke




More information about the linux-arm-kernel mailing list