[RFC PATCH v3 2/3] PCI: rockchip-host: Retry link training on failure without PERST#
Shawn Lin
shawn.lin at rock-chips.com
Thu Jul 17 20:46:33 PDT 2025
On Friday, 2025/07/18 at 11:33, Geraldo Nascimento wrote:
> On Fri, Jul 18, 2025 at 09:55:42AM +0800, Shawn Lin wrote:
>> Hi Geraldo,
>>
>> On Wednesday, 2025/06/11 at 3:05, Geraldo Nascimento wrote:
>>> After almost 30 days of battling with RK3399 buggy PCIe on my Rock Pi
>>> N10 through trial-and-error debugging, I finally got positive results
>>> with enumeration on the PCI bus for both a Realtek 8111E NIC and a
>>> Samsung PM981a SSD.
>>>
>>> The NIC was connected to a M.2->PCIe x4 riser card and it would get
>>> stuck on Polling.Compliance, without breaking electrical idle on the
>>> Host RX side. The Samsung PM981a SSD is directly connected to M.2
>>> connector and that SSD is known to be quirky (OEM... no support)
>>> and non-functional on the RK3399 platform.
>>>
>>> The Samsung SSD was even worse than the NIC - it would get stuck on
>>> Detect.Active like a bricked card, even though it was fully functional
>>> via USB adapter.
>>>
>>> It seems both devices benefit from retrying Link Training if - big if
>>> here - PERST# is not toggled during retry.
>>>
>>
>> I haven't seen this error before, especially given that the RTL8111 NIC
>> is widely used by customers.
>
> Hi Shawn, great to hear from you!
>
> Notice that my board exposes PCIe only via an NVMe connector, and not
> directly via a proper PCIe connector, so it is necessary for me to
> adapt with an inexpensive riser card that exposes a proper PCIe connector.
>
> I say this because while I don't doubt that the RTL8111 NIC works
> out-of-the-box for boards that directly expose PCIe connector, the
> combination of riser card plus NIC has a similar effect - though not
> entirely equal, as described above - of connecting known good SSDs
> that simply refuse to work with Rockchip-IP PCIe.
>
> I admit that patch 1 looks a little crazy, but it has the effect of
> enabling use of presently non-working devices or combination of devices
> on this IP, at least on the board I have access to.
>
>>
>> Could you help try this?
>> [1] apply your patch 3 first
>
> Sure, I'm always open for testing, but could you clarify the patch 3
> part? AFAIK this series of mine only has 2 patches, so I'm a little
> confused about exactly which patch to apply as a preliminary step.
Patch 3 refers to "arm64: dts: rockchip: drop PCIe 3v3 always-on and
boot-on", which lets the kernel fully control the power in case the
firmware already turned it on in advance.
>
> Also, since you're asking me to test some code, I think it is only fair
> if I ask you to test my code, too. It shouldn't be too hard for you to
> find an otherwise working NVMe SSD that refuses to complete link training
> with current code. Connect this SSD please to a RK3399 board and let us
> know if my proposed code change does anything to ameliorate the
> long-standing issue of SSD that refuses to cooperate.
Sure. I don't have a Samsung PM981a SSD right now, but I can test all of
my SSDs to see whether I can find one that doesn't work.
>
> Thank you,
> Geraldo Nascimento
>