[PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready

Manivannan Sadhasivam manivannan.sadhasivam at linaro.org
Fri May 30 08:59:57 PDT 2025


On Wed, May 28, 2025 at 05:42:51PM -0500, Bjorn Helgaas wrote:
> On Tue, May 06, 2025 at 09:39:36AM +0200, Niklas Cassel wrote:
> > Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
> > detect Link Up") changed so that we no longer call dw_pcie_wait_for_link(),
> > and instead enumerate the bus when receiving a Link Up IRQ.
> > 
> > Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe SSD is
> > no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
> > dw-rockchip: Don't wait for link since we can detect Link Up") makes his
> > SSD functional again.
> > 
> > It seems that we are enumerating the bus before the endpoint is ready.
> > Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> > threaded IRQ handler makes the SSD functional once again.
> 
> This sounds like a problem that could happen with any controller, not
> just dw-rockchip?  Are we missing some required delay that should be
> in generic code?  Or is this a PLEXTOR defect that everybody has to
> pay the price for?
> 
> Delays like this are really hard to get rid of once we add them, so
> I'm a little bit cautious.
> 

OK, I dug into the spec a little more and found the paragraph below in
r6.0, sec 6.6.1, for devices not supporting Device Readiness Status (DRS):

"With a Downstream Port that does not support Link speeds greater than 5.0 GT/s,
software must wait a minimum of 100 ms following exit from a Conventional Reset
before sending a Configuration Request to the device immediately below that
Port.

With a Downstream Port that supports Link speeds greater than 5.0 GT/s,
software must wait a minimum of 100 ms after Link training completes before
sending a Configuration Request to the device immediately below that Port.
Software can determine when Link training completes by polling the Data Link
Layer Link Active bit or by setting up an associated interrupt
(see Section 6.7.3.3). It is strongly recommended for software to use this
mechanism whenever the Downstream Port supports it."

We are not checking for DRS after PERST# deassert or after the link comes up. I
think the DRS check only applies to enumerated devices, but I'm not 100% sure.
But if we assume that the devices don't support DRS, then we should make sure
that all controller drivers wait for 100 ms after the link up event before
issuing a config request.
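
To make the timing rule concrete, here is a rough userspace sketch of how I read
the two spec paragraphs above. It is not kernel code and not taken from any
driver: struct port_state, CFG_WAIT_MS and both helpers are hypothetical names
made up for illustration, and the timestamps are plain millisecond counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum wait from the spec text quoted above (r6.0, sec 6.6.1). */
#define CFG_WAIT_MS 100u

/* Hypothetical bookkeeping for one Downstream Port (illustration only). */
struct port_state {
	bool supports_gt_5gts;   /* Port supports Link speeds > 5.0 GT/s */
	uint64_t reset_exit_ms;  /* time of exit from Conventional Reset */
	uint64_t link_up_ms;     /* time Link training completed */
};

/*
 * Earliest time at which a Configuration Request may be sent to the
 * device immediately below the Port, assuming the device has no DRS.
 * Ports limited to 5.0 GT/s count from reset exit; faster Ports count
 * from the completion of Link training.
 */
static inline uint64_t earliest_cfg_request_ms(const struct port_state *p)
{
	assert(p != NULL);
	uint64_t start = p->supports_gt_5gts ? p->link_up_ms : p->reset_exit_ms;

	return start + CFG_WAIT_MS;
}

/* How long the caller still has to sleep at time now_ms (0 if none). */
static inline uint64_t remaining_wait_ms(const struct port_state *p,
					 uint64_t now_ms)
{
	uint64_t ready = earliest_cfg_request_ms(p);

	return now_ms >= ready ? 0 : ready - now_ms;
}
```

For a Port that supports more than 5.0 GT/s and handles the Link Up IRQ right
away, now_ms equals link_up_ms, so the remaining wait is the full 100 ms. That
is the case the msleep() in the threaded IRQ handler covers.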

So I don't think this is a device-specific issue, but rather a controller-specific
one. And this makes the Qcom patch that I dropped a valid one (ofc with a change
in the description).

- Mani

-- 
மணிவண்ணன் சதாசிவம்