[PATCH v7 phy-next 08/27] PCI: Remove device links to PHY

Vladimir Oltean vladimir.oltean at nxp.com
Mon May 4 00:57:42 PDT 2026


Sashiko says:
> commit e4f495a9118b40f8a77e8dc097821d2c846ed1fc
> Author: Vladimir Oltean <vladimir.oltean at nxp.com>
> 
> PCI: Remove device links to PHY
> This commit removes device links between the PCIe controllers and PHY
> providers across several drivers, since the PHY core already creates consumer
> device links.
> 
> > diff --git a/drivers/pci/controller/cadence/pcie-cadence-plat.c b/drivers/pci/controller/cadence/pcie-cadence-plat.c
> > index b067a3296dd37..fc39c01b79648 100644
> > --- a/drivers/pci/controller/cadence/pcie-cadence-plat.c
> > +++ b/drivers/pci/controller/cadence/pcie-cadence-plat.c
> [ ... ]
> > @@ -122,9 +121,6 @@ static int cdns_plat_pcie_probe(struct platform_device *pdev)
> >  	pm_runtime_put_sync(dev);
> >  	pm_runtime_disable(dev);
> >  	cdns_pcie_disable_phy(cdns_plat_pcie->pcie);
> > -	phy_count = cdns_plat_pcie->pcie->phy_count;
> > -	while (phy_count--)
> > -		device_link_del(cdns_plat_pcie->pcie->link[phy_count]);
> >
> >  	return 0;
> 
> This isn't a bug introduced by your patch, but I noticed the error path in
> cdns_plat_pcie_probe() returns 0.
> Does this cause the driver core to consider the probe successful even when
> host or endpoint setup steps fail? It looks like the device would be left
> in an uninitialized and disabled state while still being bound to the driver.
> Should this return ret instead?
> 

The condition does not appear intentional. Commit bd22885aa188 ("PCI:
cadence: Refactor driver to use as a core library") replaced
cdns_pcie_host_probe(), which properly returned ret on error, with
cdns_plat_pcie_probe(), which does not.

I will add a preparatory patch which replaces "return 0" with "return ret"
in cdns_plat_pcie_probe(), blaming that commit.
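
For illustration, the preparatory patch would look roughly like this
(an untested sketch against the error path quoted above; the exact
context lines may differ):

	 	pm_runtime_put_sync(dev);
	 	pm_runtime_disable(dev);
	 	cdns_pcie_disable_phy(cdns_plat_pcie->pcie);

	-	return 0;
	+	return ret;
	 }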

> >  }
> > diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
> > index cd904659c3216..9c1aef4d0bd1c 100644
> > --- a/drivers/pci/controller/dwc/pci-dra7xx.c
> > +++ b/drivers/pci/controller/dwc/pci-dra7xx.c
> [ ... ]
> > @@ -852,10 +840,6 @@ static int dra7xx_pcie_probe(struct platform_device *pdev)
> >  	pm_runtime_disable(dev);
> >  	dra7xx_pcie_disable_phy(dra7xx);
> >
> > -err_link:
> > -	while (--i >= 0)
> > -		device_link_del(link[i]);
> > -
> >  	return ret;
> >  }
> 
> This isn't a bug introduced by your changes, but looking closely at the error
> path right above this hunk, there is a call to pm_runtime_put(dev) immediately
> followed by pm_runtime_disable(dev):
> err_gpio:
> err_get_sync:
> 	pm_runtime_put(dev);
> 	pm_runtime_disable(dev);
> Does this sequence create a power management race condition? The pending
> asynchronous idle/suspend work can be abruptly cancelled by
> pm_runtime_disable(), leaving the hardware in an incorrect or unpredictable
> power state.
> Should this use pm_runtime_put_sync() instead?
> 
> > diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> > index fe4836925c4e3..cab9fad212add 100644
> > --- a/drivers/pci/controller/dwc/pci-keystone.c
> > +++ b/drivers/pci/controller/dwc/pci-keystone.c
> [ ... ]
> > @@ -1337,25 +1321,18 @@ static int ks_pcie_probe(struct platform_device *pdev)
> >  	pm_runtime_disable(dev);
> >  	ks_pcie_disable_phy(ks_pcie);
> >
> > -err_link:
> > -	while (--i >= 0 && link[i])
> > -		device_link_del(link[i]);
> > -
> > +err:
> >  	return ret;
> >  }
> >
> >  static void ks_pcie_remove(struct platform_device *pdev)
> >  {
> >  	struct keystone_pcie *ks_pcie = platform_get_drvdata(pdev);
> > -	struct device_link **link = ks_pcie->link;
> > -	int num_lanes = ks_pcie->num_lanes;
> >  	struct device *dev = &pdev->dev;
> >
> >  	pm_runtime_put(dev);
> >  	pm_runtime_disable(dev);
> 
> Similar to the question above, is it possible this asynchronous put causes
> a race when immediately followed by pm_runtime_disable() during removal?
> 
> >  	ks_pcie_disable_phy(ks_pcie);
> > -	while (num_lanes--)
> > -		device_link_del(link[num_lanes]);
> >  }

It seems plausible that the pm_runtime_put() -> pm_runtime_disable()
pattern is racy, i.e. one of two things can happen: either the
asynchronous pm_runtime_put() happens to run to completion first, or
pm_runtime_disable() cancels the pending idle request. However, I am
not very familiar with the runtime PM API and its effects, and unless
a maintainer tells me to, I would prefer leaving these code paths alone.
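
For reference only (not something I am committing to here), the change
Sashiko suggests for ks_pcie_remove() would presumably look like this,
with the same pattern applying to the other affected drivers:

	-	pm_runtime_put(dev);
	+	pm_runtime_put_sync(dev);
	 	pm_runtime_disable(dev);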
