PCI trouble on mvebu (Turris Omnia)

Pali Rohár pali at kernel.org
Mon Mar 29 18:09:29 BST 2021


On Friday 26 March 2021 18:51:42 Toke Høiland-Jørgensen wrote:
> Pali Rohár <pali at kernel.org> writes:
> > On Friday 26 March 2021 17:54:38 Toke Høiland-Jørgensen wrote:
> >> So we have these
> >> cases:
> >> 
> >> ASPM disabled:          ath9k, ath10k and mt76 cards all work
> >> ASPM enabled, no patch: only mt76 card works
> >> ASPM enabled + patch:   ath10k and mt76 cards work
> >> 
> >> So IDK, maybe the ath9k card needs a quirk as well? Or the mvebu board
> >> is just generally flaky?
> >
> > I'm not sure. Maybe ASPM is somehow buggy on ath9k or needs some special
> > handling. But issue is not at PCI config space as ath9k driver start
> > initialization of this card. Needs also some debugging in ath9k driver
> > if it prints that strange "mac chip rev" error.
> 
> Well that's just being output because it gets a revision that it doesn't
> recognise - which it seems to be just reading from a register:
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/net/wireless/ath/ath9k/hw.c#L255
> 
> The value returned is consistent with the value returned just being
> 0xffffffff. Which from looking at ioread32() is the value being returned
> on a failed read. So there's a driver bug there - the check against -EIO
> here is obviously nonsensical:
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/net/wireless/ath/ath9k/hw.c#L290
> 
> But the underlying cause appears to be that the read from the register
> fails, which I suppose is related to something the PCI bus does?
> 
> > I think this issue should be handled separately. Could you report it
> > also to ath9k mailing list (and CC me)? Maybe other ath developers would
> > know some more details.
> 
> I'll send a patch for the nonsensical check above, but other than that I
> think we're still in PCI land here, or?

First, can you try enabling my quirk for this ath9k card as well, with
ASPM enabled?

I have another ath9k card here which changes its PCI device ID to
0xABCD (really!) after link retraining is toggled. But lspci ...

There is a long story about broken ath9k cards that report the 0xABCD
ID on x86 machines with specific BIOS versions. It can be found in the
ath9k-devel mailing list archive:

https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07529.html

Maybe we have now found the root cause of this 0xABCD? If so, it also
explains why the ath9k driver check above fails (the device ID was
changed) and why the kernel sees the correct ID (the kernel reads the ID
before configuring ASPM, and therefore before link retraining is
triggered).

> >> > Can you send PCI device id of your ath9k card (lspci -nn)? Because all
> >> > my tested ath9k cards have different PCI device id.
> >> 
> >> [root@omnia-arch ~]# lspci -nn
> >> 00:01.0 PCI bridge [0604]: Marvell Technology Group Ltd. Device [11ab:6820] (rev 04)
> >> 00:02.0 PCI bridge [0604]: Marvell Technology Group Ltd. Device [11ab:6820] (rev 04)
> >> 00:03.0 PCI bridge [0604]: Marvell Technology Group Ltd. Device [11ab:6820] (rev 04)
> >> 01:00.0 Network controller [0280]: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) [168c:002e] (rev 01)
> >> 02:00.0 Network controller [0280]: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter [168c:003c]
> >
> > That is fine. Also all ath9k testing cards have id 0x002e.

Today I found out that lspci -nn may lie! Please send the output of the
command: lspci -nn -x because the real PCI device ID can be read only
from the -x hexdump output.
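
As a minimal sketch of what to look for in the -x dump (my own
illustration, not part of pciutils): the first four bytes of config
space hold the vendor ID and the device ID, each stored little-endian.
The helper name parse_ids and the two sample rows are hypothetical; only
the 168c:002e and 0xABCD values come from this thread.

```python
def parse_ids(hexdump_line):
    """Return (vendor_id, device_id) from the '00:' row of an lspci -x dump."""
    offset, _, rest = hexdump_line.partition(":")
    if int(offset, 16) != 0:
        raise ValueError("need the row that starts at config offset 0x00")
    b = [int(tok, 16) for tok in rest.split()[:4]]
    if len(b) < 4:
        raise ValueError("row is too short to hold vendor and device ID")
    vendor = b[0] | (b[1] << 8)   # bytes 0-1, little-endian
    device = b[2] | (b[3] << 8)   # bytes 2-3, little-endian
    return vendor, device

# Hypothetical rows, truncated to the four bytes that matter here.
expected_row = "00: 8c 16 2e 00"   # would decode to 168c:002e
changed_row = "00: 8c 16 cd ab"    # would decode to 168c:abcd

print("%04x:%04x" % parse_ids(expected_row))
print("%04x:%04x" % parse_ids(changed_row))
```

If the ID printed by lspci -nn and the one decoded from the hexdump
differ, the hexdump is the one to trust.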

> >> >> When booting with the
> >> >> patch applied, I get this in dmesg:
> >> >> 
> >> >> [    3.556599] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
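
For reference, a small sketch of how an all-ones register read would
turn into exactly this message. The mask and shift values are quoted
from memory of ath9k's reg.h (AR_SREV_ID, AR_SREV_VERSION2,
AR_SREV_TYPE2_S, AR_SREV_REVISION2), so treat them as assumptions to be
checked against the source. decode_srev is a hypothetical name and it
models only the branch taken when the low ID byte reads 0xFF.

```python
# Assumed values, to be verified against drivers/net/wireless/ath/ath9k/reg.h
AR_SREV_ID = 0x000000FF
AR_SREV_VERSION2 = 0xFFFC0000
AR_SREV_TYPE2_S = 12
AR_SREV_REVISION2 = 0x00000F00
AR_SREV_REVISION2_S = 8

def decode_srev(srev):
    """Split a raw 32-bit revision register value into (mac_version, mac_rev)."""
    if (srev & AR_SREV_ID) != 0xFF:
        raise ValueError("old-style revision layout is not modelled here")
    # The version field is masked with VERSION2 but shifted by TYPE2_S.
    mac_version = (srev & AR_SREV_VERSION2) >> AR_SREV_TYPE2_S
    mac_rev = (srev & AR_SREV_REVISION2) >> AR_SREV_REVISION2_S
    return mac_version, mac_rev

# 0xFFFFFFFF is what a failed MMIO read typically returns.
print("Mac Chip Rev 0x%x.%x" % decode_srev(0xFFFFFFFF))
```

Under those assumptions the decoded pair matches the 0xfffc0.f in the
dmesg line, which fits the register read failing rather than the chip
reporting a real but unknown revision.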
