[RFC PATCH v3 2/3] PCI: rockchip-host: Retry link training on failure without PERST#

Manivannan Sadhasivam mani at kernel.org
Mon Jun 23 04:29:46 PDT 2025


On Tue, Jun 10, 2025 at 04:05:40PM -0300, Geraldo Nascimento wrote:
> After almost 30 days of battling with RK3399 buggy PCIe on my Rock Pi
> N10 through trial-and-error debugging, I finally got positive results
> with enumeration on the PCI bus for both a Realtek 8111E NIC and a
> Samsung PM981a SSD.
> 
> The NIC was connected to an M.2->PCIe x4 riser card and it would get
> stuck on Polling.Compliance, without breaking electrical idle on the
> Host RX side. The Samsung PM981a SSD is directly connected to M.2
> connector and that SSD is known to be quirky (OEM... no support)
> and non-functional on the RK3399 platform.
> 
> The Samsung SSD was even worse than the NIC - it would get stuck on
> Detect.Active like a bricked card, even though it was fully functional
> via USB adapter.
> 
> It seems both devices benefit from retrying Link Training if - big if
> here - PERST# is not toggled during retry.
> 
> For retry to work, flow must be exactly as handled by present patch,
> that is, we must cut power, disable the clocks, then re-enable
> both clocks and power regulators and go through initialization
> without touching PERST#. Then quirky devices are able to successfully
> enumerate.
> 

This sounds weird. PERST# is just an indication to the device that power and
refclk are applied or are about to be removed. The device uses PERST# to prepare
for power removal when it is asserted and to start functioning after deassert.

It looks like the PERST# polarity is inverted in your case. Could you please
change the 'ep-gpios' polarity to GPIO_ACTIVE_LOW and see if it fixes the issue
without this patch?

If that doesn't work, could you please drop the 'ep-gpios' property and check?
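
To make the polarity question concrete, here is a minimal userspace sketch (not
kernel code) of how a logical gpiod value maps to the PERST# pin. It relies on
two facts: gpiod_set_value_cansleep() takes a logical value that
GPIO_ACTIVE_LOW inverts at the pin, and PERST# holds the device in reset while
the pin is low. The enum and both helpers are hypothetical names used only for
illustration, and the board is assumed to wire the GPIO straight to PERST#
with no inverter in between.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: mirrors the DT polarity flags. */
enum polarity { ACTIVE_HIGH, ACTIVE_LOW };

/* Physical pin level that a logical gpiod value produces for a polarity. */
static bool physical_level(enum polarity pol, int logical)
{
	bool active = logical != 0;

	return pol == ACTIVE_LOW ? !active : active;
}

/* PERST# is active low: the device is held in reset while the pin is low. */
static bool perst_asserted(enum polarity pol, int logical)
{
	return !physical_level(pol, logical);
}
```

In this model the logical 0 written at the top of
rockchip_pcie_host_init_port() asserts PERST# only with GPIO_ACTIVE_HIGH. With
GPIO_ACTIVE_LOW the same write releases the device, which is why flipping the
flag is a cheap way to test for an inverted polarity.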

> No functional change intended for already working devices.
> 
> Signed-off-by: Geraldo Nascimento <geraldogabriel at gmail.com>
> ---
>  drivers/pci/controller/pcie-rockchip-host.c | 47 ++++++++++++++++++---
>  1 file changed, 40 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-rockchip-host.c b/drivers/pci/controller/pcie-rockchip-host.c
> index 2a1071cd3241..67b3b379d277 100644
> --- a/drivers/pci/controller/pcie-rockchip-host.c
> +++ b/drivers/pci/controller/pcie-rockchip-host.c
> @@ -338,11 +338,14 @@ static int rockchip_pcie_set_vpcie(struct rockchip_pcie *rockchip)
>  static int rockchip_pcie_host_init_port(struct rockchip_pcie *rockchip)
>  {
>  	struct device *dev = rockchip->dev;
> -	int err, i = MAX_LANE_NUM;
> +	int err, i = MAX_LANE_NUM, is_reinit = 0;
>  	u32 status;
>  
> -	gpiod_set_value_cansleep(rockchip->perst_gpio, 0);
> +	if (!is_reinit) {
> +		gpiod_set_value_cansleep(rockchip->perst_gpio, 0);
> +	}
>  
> +reinit:

So this reinit path only skips the PERST# assert, but it still calls
rockchip_pcie_init_port(), which resets the Root Port including the PHY. I don't
think that is safe if PERST# is wired.
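
The concern can be sketched with a small userspace model (an illustration
only, not driver code). The struct and both functions are hypothetical, and
the model assumes each init pass ends by releasing PERST# for link training.
It counts how often the Root Port/PHY reset happens while the device is
already out of reset.

```c
#include <stdbool.h>

/* Hypothetical model of the flow under review; not names from the driver. */
struct link_model {
	bool perst_asserted;        /* device currently held in reset */
	int  resets_while_released; /* port/PHY resets seen with PERST# released */
};

/* Stand-in for the Root Port + PHY reset in rockchip_pcie_init_port(). */
static void model_reset_port(struct link_model *m)
{
	if (!m->perst_asserted)
		m->resets_while_released++;
}

/* One init pass; is_reinit skips the PERST# assert as in the patch. */
static void model_init_pass(struct link_model *m, bool is_reinit)
{
	if (!is_reinit)
		m->perst_asserted = true;
	model_reset_port(m);
	m->perst_asserted = false; /* assumed: release for link training */
}
```

A first pass with is_reinit false leaves the counter at 0. A retry pass with
is_reinit true increments it, meaning the port and PHY were reset underneath a
device that was never told to expect it.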

- Mani

-- 
மணிவண்ணன் சதாசிவம்


