PCI trouble on mvebu (Turris Omnia)

Toke Høiland-Jørgensen toke at redhat.com
Thu Mar 18 22:43:58 GMT 2021


Pali Rohár <pali at kernel.org> writes:

> On Monday 15 March 2021 20:58:06 Pali Rohár wrote:
>> On Monday 02 November 2020 16:54:35 Toke Høiland-Jørgensen wrote:
>> > Pali Rohár <pali at kernel.org> writes:
>> > 
>> > > On Saturday 31 October 2020 13:49:49 Toke Høiland-Jørgensen wrote:
>> > >> "™֟☻̭҇ Ѽ ҉ ®" <vtolkm at googlemail.com> writes:
>> > >> 
>> > >> > On 30/10/2020 15:23, Pali Rohár wrote:
>> > >> >> On Friday 30 October 2020 14:02:22 Toke Høiland-Jørgensen wrote:
>> > >> >>> Pali Rohár <pali at kernel.org> writes:
>> > >> >>>> My experience with that WLE900VX card, aardvark driver and aspm code:
>> > >> >>>>
>> > >> >>>> Link training in GEN2 mode for this card succeed only once after reset.
>> > >> >>>> Repeated link retraining fails and it fails even when aardvark is
>> > >> >>>> reconfigured to GEN1 mode. Reset via PERST# signal is required to have
>> > >> >>>> working link training.
>> > >> >>>>
>> > >> >>>> What I did in aardvark driver: Set mode to GEN2, do link training. If
>> > >> >>>> success read "negotiated link speed" from "Link Control Status Register"
>> > >> >>>> (for WLE900VX it is 0x1 - GEN1) and set it into aardvark. And then
>> > >> >>>> retrain link again (for WLE900VX now it would be at GEN1). After that
>> > >> >>>> card is stable and all future retraining (e.g. from aspm.c) also passes.
>> > >> >>>>
>> > >> >>>> If I do not change aardvark mode from GEN2 to GEN1 the second link
>> > >> >>>> training fails. And if I change mode to GEN1 after this failed link
>> > >> >>>> training then nothing happen, link training do not success.
>> > >> >>>>
>> > >> >>>> So just speculation now... In current setup initialization of card does
>> > >> >>>> one link training at GEN2. Then aspm.c is called which is doing second
>> > >> >>>> link retraining at GEN2. And if it fails then below patch issue third
>> > >> >>>> link retraining at GEN1. If A38x/pci-mvebu has same problem as aardvark
>> > >> >>>> then second link retraining must be at GEN1 (not GEN2) to workaround
>> > >> >>>> this issue.
>> > >> >>>>
>> > >> >>>> Bjorn, Toke: what about trying to hack aspm.c code to never do link
>> > >> >>>> retraining at GEN2 speed? And always force GEN1 speed prior link
>> > >> >>>> training?
>> > >> >>> Sounds like a plan. I poked around in aspm.c and must confess to being a
>> > >> >>> bit lost in the soup of registers ;)
>> > >> >>>
>> > >> >>> So if one of you can cook up a patch, that would be most helpful!
>> > >> >> I modified Bjorn's patch, explicitly set tls to 1 and added debug info
>> > >> >> about cls (current link speed, that what is used by aardvark). It is
>> > >> >> untested, I just tried to compile it.
>> > >> >>
>> > >> >> Can try it?
>> > >> >>
>> > >> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>> > >> >> index 253c30cc1967..f934c0b52f41 100644
>> > >> >> --- a/drivers/pci/pcie/aspm.c
>> > >> >> +++ b/drivers/pci/pcie/aspm.c
>> > >> >> @@ -206,6 +206,27 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>> > >> >>   	unsigned long end_jiffies;
>> > >> >>   	u16 reg16;
>> > >> >>   
>> > >> >> +	u32 lnkcap2;
>> > >> >> +	u16 lnksta, lnkctl2, cls, tls;
>> > >> >> +
>> > >> >> +	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, &lnkcap2);
>> > >> >> +	pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &lnksta);
>> > >> >> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
>> > >> >> +	cls = lnksta & PCI_EXP_LNKSTA_CLS;
>> > >> >> +	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
>> > >> >> +
>> > >> >> +	pci_info(parent, "lnkcap2 %#010x sls %#04x lnksta %#06x cls %#03x lnkctl2 %#06x tls %#03x\n",
>> > >> >> +		lnkcap2, (lnkcap2 & 0x3F) >> 1,
>> > >> >> +		lnksta, cls,
>> > >> >> +		lnkctl2, tls);
>> > >> >> +
>> > >> >> +	tls = 1;
>> > >> >> +	pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2,
>> > >> >> +					PCI_EXP_LNKCTL2_TLS, tls);
>> > >> >> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
>> > >> >> +	pci_info(parent, "lnkctl2 %#010x new tls %#03x\n",
>> > >> >> +		lnkctl2, tls);
>> > >> >> +
>> > >> >>   	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
>> > >> >>   	reg16 |= PCI_EXP_LNKCTL_RL;
>> > >> >>   	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
>> > >> >> @@ -227,6 +248,8 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>> > >> >>   			break;
>> > >> >>   		msleep(1);
>> > >> >>   	} while (time_before(jiffies, end_jiffies));
>> > >> >> +	pci_info(parent, "lnksta %#06x new cls %#03x\n",
>> > >> >> +		lnksta, (cls & PCI_EXP_LNKSTA_CLS));
>> > >> >>   	return !(reg16 & PCI_EXP_LNKSTA_LT);
>> > >> >>   }
>> > >> >>   
>> > >> >
>> > >> > Still exhibiting the BAR update error, run tested with next--20201030
>> > >> 
>> > >> Yup, same for me :(
>> 
>> I'm answering my own question. This code does not work on Omnia because
>> A38x pci-mvebu.c driver is using emulator for PCIe root bridge and it
>> does not implement PCI_EXP_LNKCTL2 and PCI_EXP_LNKCTL2 registers. So
>> code for forcing link speed has no effect on Omnia...
>
> Toke, on A38x PCIe controller it is possible to access PCI_EXP_LNKCTL2
> register. Just access is not exported via emulated root bridge.
>
> Documentation for this PCIe controller is public, so anybody can look at
> register description. See page 571, A.7 PCI Express 2.0 Port 0 Registers
>
> http://web.archive.org/web/20200420191927/https://www.marvell.com/content/dam/marvell/en/public-collateral/embedded-processors/marvell-embedded-processors-armada-38x-functional-specifications-2015-11.pdf
>
> In drivers/pci/controller/pci-mvebu.c you can set a new value for this
> register via function call:
>
>     mvebu_writel(port, val, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
>
> So, could you try to set PCI_EXP_LNKCTL2_TLS bits to gen1 in some hw
> init function, e.g. mvebu_pcie_setup_hw()?
>
>     u32 val = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
>     val &= ~PCI_EXP_LNKCTL2_TLS;
>     val |= PCI_EXP_LNKCTL2_TLS_2_5GT;
>     mvebu_writel(port, val, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);

I pasted this into the top of mvebu_pcie_setup_hw(), and that indeed
seems to fix things so that all three PCIE devices work even with ASPM
turned on! :)

Do you still need me to test the card on a different machine? Not sure I
have an x86 machine with a mini-PCIe slot handy, but I can go hunting if
needed...

-Toke




More information about the linux-arm-kernel mailing list