[PATCH v2] PCI: dw-rockchip: Delay link training after hot reset in EP mode

Bjorn Helgaas helgaas at kernel.org
Wed Jun 18 12:59:59 PDT 2025


On Wed, Jun 18, 2025 at 04:04:17PM +0200, Niklas Cassel wrote:
> On Tue, Jun 17, 2025 at 05:01:14PM -0500, Bjorn Helgaas wrote:
> > On Fri, Jun 13, 2025 at 12:19:09PM +0200, Niklas Cassel wrote:
> > > From: Wilfred Mallawa <wilfred.mallawa at wdc.com>
> > > 
> > > RK3588 TRM, section "11.6.1.3.3 Hot Reset and Link-Down Reset" states that:
> > > """
> > > If you want to delay link re-establishment (after reset) so that you can
> > > reprogram some registers through DBI, you must set app_ltssm_enable =0
> > > immediately after core_rst_n as shown in above. This can be achieved by
> > > enable the app_dly2_en, and end-up the delay by assert app_dly2_done.
> > > """
> > >
> > > I.e. setting app_dly2_en will automatically deassert app_ltssm_enable on
> > > a hot reset, and setting app_dly2_done will re-assert app_ltssm_enable,
> > > re-enabling link training.
> > > 
> > > When receiving a hot reset/link-down IRQ when running in EP mode, we will
> > > call dw_pcie_ep_linkdown(), which will call the .link_down() callback in
> > > the currently bound endpoint function (EPF) drivers.
> > > 
> > > The callback in an EPF driver can theoretically take a long time to
> > > complete, so make sure that the link is not re-established until after
> > > dw_pcie_ep_linkdown() (which calls the .link_down() callback(s)
> > > synchronously).
> > 
> > I don't know why we care *how long* EPF callbacks might take.
> 
> Well, because currently, we do NOT delay link training, and everything
> works as expected.
> 
> Most likely we are just lucky, because dw_pcie_ep_linkdown() calls
> dw_pcie_ep_init_non_sticky_registers(), which is quite a short function.

I'm just making the point that IIUC there's a race between link
training and any DBI accesses done by
dw_pcie_ep_init_non_sticky_registers() and potentially EPF callbacks,
and the time those paths take is immaterial.

If this is indeed a race and this patch is the fix, I think it's
misleading to describe it as "this path might take a long time and
lose the race."  We have to assume arbitrary delays can be added to
either path, so we can never rely on a path being "fast enough" to
avoid the race.

Is the following basically what we're doing?

  Set PCIE_LTSSM_APP_DLY2_EN so the controller never automatically
  trains the link after a link-down interrupt.  That way any DBI
  updates done in the dw_pcie_ep_linkdown() path will happen while the
  link is still down.  Then allow link training by setting
  PCIE_LTSSM_APP_DLY2_DONE.

We don't set PCIE_LTSSM_APP_DLY2_DONE anywhere in the initial probe
path.  Obviously the link must train in that case, so I guess
PCIE_LTSSM_APP_DLY2_EN only applies to the case of link state
transition from link-up to link-down?

> During a hot reset, the BARs get resized to 1 GB (yes, that is the
> default/reset value on rk3588), so the fact that the host sees a smaller
> BAR size means that dw_pcie_ep_init_non_sticky_registers() must have had
> time to run before link training completed.
> 
> But we do not want to rely on luck for these DBI writes to finish before
> link training is complete, hence this patch.
> 
> The .link_down() callback in drivers/pci/endpoint/functions/pci-epf-test.c
> simply does a cancel_delayed_work_sync().
> 
> I could imagine an EPF driver doing some more time consuming work in the
> callback, like allocating memory (which could trigger direct reclaim), and
> then calling pci_epc_set_bar() which will eventually result in some DBI
> writes. That most likely would not work without this patch.
> 
> > From the TRM quote, it sounds like the important thing is that you
> > don't want the link to train before dw_pcie_ep_linkdown() calls
> > dw_pcie_ep_init_non_sticky_registers(), which looks like it programs
> > registers through DBI.
> > 
> > Maybe you also want to allow the EFP ->link_down() callbacks to also
> > program things via DBI before link training?  But I don't think the
> > amount of time they take is relevant.  If you need to do *anything*
> > via DBI before the link trains, you have to prevent training until
> > you're finished with DBI.
> > 
> > > Signed-off-by: Wilfred Mallawa <wilfred.mallawa at wdc.com>
> > > Co-developed-by: Niklas Cassel <cassel at kernel.org>
> > > Signed-off-by: Niklas Cassel <cassel at kernel.org>
> > > ---
> > > Changes since v1:
> > > -Rebased on v6.16-rc1
> > > 
> > >  drivers/pci/controller/dwc/pcie-dw-rockchip.c | 15 ++++++++++++---
> > >  1 file changed, 12 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > > index 93171a392879..cd1e9352b21f 100644
> > > --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > > +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > > @@ -58,6 +58,8 @@
> > >  
> > >  /* Hot Reset Control Register */
> > >  #define PCIE_CLIENT_HOT_RESET_CTRL	0x180
> > > +#define  PCIE_LTSSM_APP_DLY2_EN		BIT(1)
> > > +#define  PCIE_LTSSM_APP_DLY2_DONE	BIT(3)
> > >  #define  PCIE_LTSSM_ENABLE_ENHANCE	BIT(4)
> > >  
> > >  /* LTSSM Status Register */
> > > @@ -474,7 +476,7 @@ static irqreturn_t rockchip_pcie_ep_sys_irq_thread(int irq, void *arg)
> > >  	struct rockchip_pcie *rockchip = arg;
> > >  	struct dw_pcie *pci = &rockchip->pci;
> > >  	struct device *dev = pci->dev;
> > > -	u32 reg;
> > > +	u32 reg, val;
> > >  
> > >  	reg = rockchip_pcie_readl_apb(rockchip, PCIE_CLIENT_INTR_STATUS_MISC);
> > >  	rockchip_pcie_writel_apb(rockchip, reg, PCIE_CLIENT_INTR_STATUS_MISC);
> > > @@ -485,6 +487,10 @@ static irqreturn_t rockchip_pcie_ep_sys_irq_thread(int irq, void *arg)
> > >  	if (reg & PCIE_LINK_REQ_RST_NOT_INT) {
> > >  		dev_dbg(dev, "hot reset or link-down reset\n");
> > >  		dw_pcie_ep_linkdown(&pci->ep);
> > > +		/* Stop delaying link training. */
> > > +		val = HIWORD_UPDATE_BIT(PCIE_LTSSM_APP_DLY2_DONE);
> > > +		rockchip_pcie_writel_apb(rockchip, val,
> > > +					 PCIE_CLIENT_HOT_RESET_CTRL);
> > >  	}
> > >  
> > >  	if (reg & PCIE_RDLH_LINK_UP_CHGED) {
> > > @@ -566,8 +572,11 @@ static int rockchip_pcie_configure_ep(struct platform_device *pdev,
> > >  		return ret;
> > >  	}
> > >  
> > > -	/* LTSSM enable control mode */
> > > -	val = HIWORD_UPDATE_BIT(PCIE_LTSSM_ENABLE_ENHANCE);
> > > +	/*
> > > +	 * LTSSM enable control mode, and automatically delay link training on
> > > +	 * hot reset/link-down reset.
> > > +	 */
> > > +	val = HIWORD_UPDATE_BIT(PCIE_LTSSM_ENABLE_ENHANCE | PCIE_LTSSM_APP_DLY2_EN);
> > >  	rockchip_pcie_writel_apb(rockchip, val, PCIE_CLIENT_HOT_RESET_CTRL);
> > >  
> > >  	rockchip_pcie_writel_apb(rockchip, PCIE_CLIENT_EP_MODE,
> > > -- 
> > > 2.49.0
> > > 



More information about the linux-arm-kernel mailing list