[RFC PATCH v3 2/3] PCI: rockchip-host: Retry link training on failure without PERST#
Geraldo Nascimento
geraldogabriel at gmail.com
Thu Jul 17 20:33:05 PDT 2025
On Fri, Jul 18, 2025 at 09:55:42AM +0800, Shawn Lin wrote:
> Hi Geraldo,
>
> On Wed, 2025/06/11 at 3:05, Geraldo Nascimento wrote:
> > After almost 30 days of battling with RK3399 buggy PCIe on my Rock Pi
> > N10 through trial-and-error debugging, I finally got positive results
> > with enumeration on the PCI bus for both a Realtek 8111E NIC and a
> > Samsung PM981a SSD.
> >
> > The NIC was connected to a M.2->PCIe x4 riser card and it would get
> > stuck on Polling.Compliance, without breaking electrical idle on the
> > Host RX side. The Samsung PM981a SSD is directly connected to M.2
> > connector and that SSD is known to be quirky (OEM... no support)
> > and non-functional on the RK3399 platform.
> >
> > The Samsung SSD was even worse than the NIC - it would get stuck on
> > Detect.Active like a bricked card, even though it was fully functional
> > via USB adapter.
> >
> > It seems both devices benefit from retrying Link Training if - big if
> > here - PERST# is not toggled during retry.
> >
>
> I haven't seen this error before, especially given that the RTL8111 NIC
> is widely used by customers.
Hi Shawn, great to hear from you!
Notice that my board exposes PCIe only via NVMe connector, and not
directly via a proper PCIe connector, so it is necessary for me to
adapt with inexpensive riser card that exposes proper PCIe connector.
I say this because while I don't doubt that the RTL8111 NIC works
out-of-the-box for boards that directly expose PCIe connector, the
combination of riser card plus NIC has a similar effect - though not
entirely equal, as described above - of connecting known good SSDs
that simply refuse to work with Rockchip-IP PCIe.
I admit that patch 1 looks a little crazy, but it has the effect of
enabling use of presently non-working devices or combination of devices
on this IP, at least on the board I have access to.
>
> Could you help try this?
> [1] apply your patch 3 first
Sure, I'm always open for testing, but could you clarify the patch 3
part? AFAIK this series of mine only has 2 patches, so I'm a little
confused about exactly which patch to apply as a preliminary step.
Also, since you're asking me to test some code, I think it is only fair
if I ask you to test my code, too. It shouldn't be too hard for you to
find an otherwise working NVMe SSD that refuses to complete link training
with current code. Please connect this SSD to an RK3399 board and let us
know if my proposed code change does anything to ameliorate the
long-standing issue of SSD that refuses to cooperate.
Thank you,
Geraldo Nascimento