PCI trouble on mvebu (Turris Omnia)

Toke Høiland-Jørgensen toke at redhat.com
Sat Oct 31 08:49:49 EDT 2020


"™֟☻̭҇ Ѽ ҉ ®" <vtolkm at googlemail.com> writes:

> On 30/10/2020 15:23, Pali Rohár wrote:
>> On Friday 30 October 2020 14:02:22 Toke Høiland-Jørgensen wrote:
>>> Pali Rohár <pali at kernel.org> writes:
>>>> My experience with that WLE900VX card, aardvark driver and aspm code:
>>>>
>>>> Link training in GEN2 mode for this card succeed only once after reset.
>>>> Repeated link retraining fails and it fails even when aardvark is
>>>> reconfigured to GEN1 mode. Reset via PERST# signal is required to have
>>>> working link training.
>>>>
>>>> What I did in aardvark driver: Set mode to GEN2, do link training. If
>>>> success read "negotiated link speed" from "Link Control Status Register"
>>>> (for WLE900VX it is 0x1 - GEN1) and set it into aardvark. And then
>>>> retrain link again (for WLE900VX now it would be at GEN1). After that
>>>> card is stable and all future retraining (e.g. from aspm.c) also passes.
>>>>
>>>> If I do not change aardvark mode from GEN2 to GEN1 the second link
>>>> training fails. And if I change mode to GEN1 after this failed link
>>>> training then nothing happen, link training do not success.
>>>>
>>>> So just speculation now... In current setup initialization of card does
>>>> one link training at GEN2. Then aspm.c is called which is doing second
>>>> link retraining at GEN2. And if it fails then below patch issue third
>>>> link retraining at GEN1. If A38x/pci-mvebu has same problem as aardvark
>>>> then second link retraining must be at GEN1 (not GEN2) to workaround
>>>> this issue.
>>>>
>>>> Bjorn, Toke: what about trying to hack aspm.c code to never do link
>>>> retraining at GEN2 speed? And always force GEN1 speed prior link
>>>> training?
>>> Sounds like a plan. I poked around in aspm.c and must confess to being a
>>> bit lost in the soup of registers ;)
>>>
>>> So if one of you can cook up a patch, that would be most helpful!
>> I modified Bjorn's patch, explicitly set tls to 1 and added debug info
>> about cls (current link speed, that what is used by aardvark). It is
>> untested, I just tried to compile it.
>>
>> Can try it?
>>
>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>> index 253c30cc1967..f934c0b52f41 100644
>> --- a/drivers/pci/pcie/aspm.c
>> +++ b/drivers/pci/pcie/aspm.c
>> @@ -206,6 +206,27 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>>   	unsigned long end_jiffies;
>>   	u16 reg16;
>>   
>> +	u32 lnkcap2;
>> +	u16 lnksta, lnkctl2, cls, tls;
>> +
>> +	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, &lnkcap2);
>> +	pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &lnksta);
>> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
>> +	cls = lnksta & PCI_EXP_LNKSTA_CLS;
>> +	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
>> +
>> +	pci_info(parent, "lnkcap2 %#010x sls %#04x lnksta %#06x cls %#03x lnkctl2 %#06x tls %#03x\n",
>> +		lnkcap2, (lnkcap2 & 0x3F) >> 1,
>> +		lnksta, cls,
>> +		lnkctl2, tls);
>> +
>> +	tls = 1;
>> +	pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2,
>> +					PCI_EXP_LNKCTL2_TLS, tls);
>> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
>> +	pci_info(parent, "lnkctl2 %#010x new tls %#03x\n",
>> +		lnkctl2, tls);
>> +
>>   	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
>>   	reg16 |= PCI_EXP_LNKCTL_RL;
>>   	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
>> @@ -227,6 +248,8 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>>   			break;
>>   		msleep(1);
>>   	} while (time_before(jiffies, end_jiffies));
>> +	pci_info(parent, "lnksta %#06x new cls %#03x\n",
>> +		lnksta, (cls & PCI_EXP_LNKSTA_CLS));
>>   	return !(reg16 & PCI_EXP_LNKSTA_LT);
>>   }
>>   
>
> Still exhibiting the BAR update error, run tested with next--20201030

Yup, same for me :(

-Toke




More information about the linux-arm-kernel mailing list