[PATCH v7 2/4] PCI: host-common: Add link down handling for Root Ports

Manivannan Sadhasivam mani at kernel.org
Tue Mar 10 22:04:17 PDT 2026


On Wed, Mar 11, 2026 at 08:55:01AM +0800, Shawn Lin wrote:
> Hi Mani
> 
> On Tue, 2026/03/10 22:02, Manivannan Sadhasivam via B4 Relay wrote:
> > From: Manivannan Sadhasivam <mani at kernel.org>
> > 
> > The PCI link, when down, needs to be recovered to bring it back. But on
> > some platforms, that cannot be done in a generic way as link recovery
> > procedure is platform specific. So add a new API
> > pci_host_handle_link_down() that could be called by the host bridge drivers
> > for a specific Root Port when the link goes down.
> > 
> > The API accepts the 'pci_dev' corresponding to the Root Port which observed
> > the link down event. If CONFIG_PCIEAER is enabled, the API calls
> > pcie_do_recovery() function with 'pci_channel_io_frozen' as the state. This
> > will result in the execution of the AER Fatal error handling code. Since
> > the link down recovery is pretty much the same as AER Fatal error handling,
> > pcie_do_recovery() helper is reused here. First, the AER error_detected()
> > callback will be triggered for the bridge and then for the downstream
> > devices. Finally, pci_host_reset_root_port() will be called for the Root
> > Port, which will reset the Root Port using 'reset_root_port' callback to
> > recover the link. Once that's done, resume message will be broadcasted to
> > the bridge and the downstream devices, indicating successful link recovery.
> > 
> > But if CONFIG_PCIEAER is not enabled in the kernel, only
> > pci_host_reset_root_port() API will be called, which will in turn call
> > pci_bus_error_reset() to just reset the Root Port as there is no way we
> > could inform the drivers about link recovery.
> > 
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam at linaro.org>
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam at oss.qualcomm.com>
> > Tested-by: Brian Norris <briannorris at chromium.org>
> > Tested-by: Krishna Chaitanya Chundru <krishna.chundru at oss.qualcomm.com>
> > Tested-by: Richard Zhu <hongxing.zhu at nxp.com>
> > Reviewed-by: Frank Li <Frank.Li at nxp.com>
> > ---
> >   drivers/pci/controller/pci-host-common.c | 35 ++++++++++++++++++++++++++++++++
> >   drivers/pci/controller/pci-host-common.h |  1 +
> >   drivers/pci/pci.c                        |  1 +
> >   drivers/pci/pcie/err.c                   |  1 +
> >   4 files changed, 38 insertions(+)
> > 
> > diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c
> > index d6258c1cffe5..15ebff8a542a 100644
> > --- a/drivers/pci/controller/pci-host-common.c
> > +++ b/drivers/pci/controller/pci-host-common.c
> > @@ -12,9 +12,11 @@
> >   #include <linux/of.h>
> >   #include <linux/of_address.h>
> >   #include <linux/of_pci.h>
> > +#include <linux/pci.h>
> >   #include <linux/pci-ecam.h>
> >   #include <linux/platform_device.h>
> > +#include "../pci.h"
> >   #include "pci-host-common.h"
> >   static void gen_pci_unmap_cfg(void *ptr)
> > @@ -106,5 +108,38 @@ void pci_host_common_remove(struct platform_device *pdev)
> >   }
> >   EXPORT_SYMBOL_GPL(pci_host_common_remove);
> > +static pci_ers_result_t pci_host_reset_root_port(struct pci_dev *dev)
> > +{
> > +	int ret;
> > +
> > +	pci_lock_rescan_remove();
> > +	ret = pci_bus_error_reset(dev);
> > +	pci_unlock_rescan_remove();
> > +	if (ret) {
> > +		pci_err(dev, "Failed to reset Root Port: %d\n", ret);
> > +		return PCI_ERS_RESULT_DISCONNECT;
> > +	}
> > +
> > +	pci_info(dev, "Root Port has been reset\n");
> > +
> > +	return PCI_ERS_RESULT_RECOVERED;
> > +}
> > +
> > +static void pci_host_recover_root_port(struct pci_dev *port)
> > +{
> > +#if IS_ENABLED(CONFIG_PCIEAER)
> > +	pcie_do_recovery(port, pci_channel_io_frozen, pci_host_reset_root_port);
> > +#else
> > +	pci_host_reset_root_port(port);
> 
> Since pci_host_reset_root_port() returns pci_ers_result_t, shouldn't we
> check the result? If the return value is intentionally ignored here,
> maybe pci_host_reset_root_port actually doesn't need a return value at
> all?
> 

The return value is mostly for pcie_do_recovery() which iterates through the
subordinate devices and calls pci_host_reset_root_port(). It also makes use of
the return value, so we cannot make it void.

The reason I skipped the return value in pci_host_handle_link_down() is that
we cannot do much in the case of failure other than report it, and that is
already taken care of in pci_host_reset_root_port().
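
To illustrate the point about the return value, here is a minimal userspace
model, not kernel code. Every type and function name in it is a hypothetical
stand-in chosen for this sketch: reset_root_port() plays the role of
pci_host_reset_root_port(), do_recovery() is a heavily simplified stand-in for
the way pcie_do_recovery() consumes the callback's result, and
recover_without_aer() models the !CONFIG_PCIEAER branch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for pci_ers_result_t (not the kernel enum). */
typedef enum {
	ERS_RESULT_RECOVERED,
	ERS_RESULT_DISCONNECT,
} ers_result_t;

/* Hypothetical stand-in for the Root Port's struct pci_dev. */
struct port {
	const char *name;
	int reset_fails;	/* non-zero makes the simulated reset fail */
	int resumed;		/* set once "resume" has been broadcast */
};

/* Models pci_host_reset_root_port(): it reports the failure itself. */
static ers_result_t reset_root_port(struct port *p)
{
	assert(p != NULL);

	if (p->reset_fails) {
		fprintf(stderr, "%s: Failed to reset Root Port\n", p->name);
		return ERS_RESULT_DISCONNECT;
	}

	return ERS_RESULT_RECOVERED;
}

/*
 * Models the AER path: the recovery helper needs the callback's result
 * to decide whether to broadcast resume, so the callback cannot be void.
 */
static ers_result_t do_recovery(struct port *p,
				ers_result_t (*reset)(struct port *))
{
	ers_result_t res = reset(p);

	if (res == ERS_RESULT_RECOVERED)
		p->resumed = 1;

	return res;
}

/*
 * Models the non-AER path: there is nobody to notify, and the failure
 * has already been logged by the reset helper, so the result is dropped.
 */
static void recover_without_aer(struct port *p)
{
	(void)reset_root_port(p);
}
```

In this model the same reset helper serves both paths. Only the recovery
helper acts on the result; the non-AER caller has nothing useful to do with it.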

> > +#endif
> > +}
> > +
> > +void pci_host_handle_link_down(struct pci_dev *port)
> > +{
> > +	pci_info(port, "Recovering Root Port due to Link Down\n");
> > +	pci_host_recover_root_port(port);
> > +}
> > +EXPORT_SYMBOL_GPL(pci_host_handle_link_down);
> 
> This function shouldn't be called from interrupt context because of
> pci_lock_rescan_remove() and pci_bus_error_reset()::pci_slot_mutex,
> but that's not obvious from the API name. Host drivers are likely
> to use it like:
> 
> register_LDn_irq -> irq isr -> pci_host_handle_link_down()
> 
> So perhaps it would be better to add a comment about it.
> 

Yes, I agree. I mentioned in the cover letter that this API should be called
from a threaded IRQ handler, but it should be mentioned in the API description
too. I will add it in the next version or amend it while applying.
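
As a rough illustration of the intended calling pattern, below is a userspace
model of the hard IRQ / threaded handler split. It is only a sketch under
stated assumptions: 'struct my_pcie', my_pcie_hardirq(), my_pcie_thread() and
dispatch_irq() are hypothetical names, and the irqreturn_t, struct pci_dev and
pci_host_handle_link_down() definitions are stubs so that the example builds
outside the kernel. In a real host bridge driver the two handlers would
typically be registered with devm_request_threaded_irq(), and the IRQ core
would run the thread function in process context.

```c
#include <assert.h>
#include <stdbool.h>

/* Stubs so the sketch builds in userspace; not the kernel definitions. */
typedef enum { IRQ_NONE, IRQ_HANDLED, IRQ_WAKE_THREAD } irqreturn_t;
struct pci_dev { int link_down_handled; };

static bool in_hard_irq;	/* models "running in hard IRQ context" */

/* Stub: the real function takes mutexes, so it must not run in hard IRQ. */
static void pci_host_handle_link_down(struct pci_dev *port)
{
	assert(!in_hard_irq);
	port->link_down_handled++;
}

/* Hypothetical host bridge driver state. */
struct my_pcie {
	struct pci_dev *root_port;
	bool link_down_pending;	/* models a Link Down status bit */
};

/* Hard IRQ half: only checks the status and defers the work. */
static irqreturn_t my_pcie_hardirq(int irq, void *data)
{
	struct my_pcie *pcie = data;

	(void)irq;
	if (!pcie->link_down_pending)
		return IRQ_NONE;

	return IRQ_WAKE_THREAD;
}

/* Threaded half: sleeping is allowed here, so recovery is safe. */
static irqreturn_t my_pcie_thread(int irq, void *data)
{
	struct my_pcie *pcie = data;

	(void)irq;
	pcie->link_down_pending = false;
	pci_host_handle_link_down(pcie->root_port);

	return IRQ_HANDLED;
}

/* Stub for the IRQ core: runs the thread only when asked to. */
static irqreturn_t dispatch_irq(struct my_pcie *pcie)
{
	irqreturn_t ret;

	in_hard_irq = true;
	ret = my_pcie_hardirq(0, pcie);
	in_hard_irq = false;

	if (ret == IRQ_WAKE_THREAD)
		ret = my_pcie_thread(0, pcie);

	return ret;
}
```

Calling pci_host_handle_link_down() directly from my_pcie_hardirq() would trip
the assertion in this model, which mirrors the sleeping-in-atomic problem
Shawn pointed out.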

- Mani

-- 
மணிவண்ணன் சதாசிவம்


