[PATCH v7 2/4] PCI: host-common: Add link down handling for Root Ports
Shawn Lin
shawn.lin at rock-chips.com
Tue Mar 10 17:55:01 PDT 2026
Hi Mani
On Tue, 2026/03/10 22:02, Manivannan Sadhasivam via B4 Relay wrote:
> From: Manivannan Sadhasivam <mani at kernel.org>
>
> The PCI link, when down, needs to be recovered to bring it back. But on
> some platforms, that cannot be done in a generic way as link recovery
> procedure is platform specific. So add a new API
> pci_host_handle_link_down() that could be called by the host bridge drivers
> for a specific Root Port when the link goes down.
>
> The API accepts the 'pci_dev' corresponding to the Root Port which observed
> the link down event. If CONFIG_PCIEAER is enabled, the API calls
> pcie_do_recovery() function with 'pci_channel_io_frozen' as the state. This
> will result in the execution of the AER Fatal error handling code. Since
> the link down recovery is pretty much the same as AER Fatal error handling,
> pcie_do_recovery() helper is reused here. First, the AER error_detected()
> callback will be triggered for the bridge and then for the downstream
> devices. Finally, pci_host_reset_root_port() will be called for the Root
> Port, which will reset the Root Port using 'reset_root_port' callback to
> recover the link. Once that's done, resume message will be broadcasted to
> the bridge and the downstream devices, indicating successful link recovery.
>
> But if CONFIG_PCIEAER is not enabled in the kernel, only
> pci_host_reset_root_port() API will be called, which will in turn call
> pci_bus_error_reset() to just reset the Root Port as there is no way we
> could inform the drivers about link recovery.
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam at linaro.org>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam at oss.qualcomm.com>
> Tested-by: Brian Norris <briannorris at chromium.org>
> Tested-by: Krishna Chaitanya Chundru <krishna.chundru at oss.qualcomm.com>
> Tested-by: Richard Zhu <hongxing.zhu at nxp.com>
> Reviewed-by: Frank Li <Frank.Li at nxp.com>
> ---
> drivers/pci/controller/pci-host-common.c | 35 ++++++++++++++++++++++++++++++++
> drivers/pci/controller/pci-host-common.h | 1 +
> drivers/pci/pci.c | 1 +
> drivers/pci/pcie/err.c | 1 +
> 4 files changed, 38 insertions(+)
>
> diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c
> index d6258c1cffe5..15ebff8a542a 100644
> --- a/drivers/pci/controller/pci-host-common.c
> +++ b/drivers/pci/controller/pci-host-common.c
> @@ -12,9 +12,11 @@
> #include <linux/of.h>
> #include <linux/of_address.h>
> #include <linux/of_pci.h>
> +#include <linux/pci.h>
> #include <linux/pci-ecam.h>
> #include <linux/platform_device.h>
>
> +#include "../pci.h"
> #include "pci-host-common.h"
>
> static void gen_pci_unmap_cfg(void *ptr)
> @@ -106,5 +108,38 @@ void pci_host_common_remove(struct platform_device *pdev)
> }
> EXPORT_SYMBOL_GPL(pci_host_common_remove);
>
> +static pci_ers_result_t pci_host_reset_root_port(struct pci_dev *dev)
> +{
> + int ret;
> +
> + pci_lock_rescan_remove();
> + ret = pci_bus_error_reset(dev);
> + pci_unlock_rescan_remove();
> + if (ret) {
> + pci_err(dev, "Failed to reset Root Port: %d\n", ret);
> + return PCI_ERS_RESULT_DISCONNECT;
> + }
> +
> + pci_info(dev, "Root Port has been reset\n");
> +
> + return PCI_ERS_RESULT_RECOVERED;
> +}
> +
> +static void pci_host_recover_root_port(struct pci_dev *port)
> +{
> +#if IS_ENABLED(CONFIG_PCIEAER)
> + pcie_do_recovery(port, pci_channel_io_frozen, pci_host_reset_root_port);
> +#else
> + pci_host_reset_root_port(port);
Since pci_host_reset_root_port() returns pci_ers_result_t, shouldn't we
check the result here? If the return value is intentionally ignored,
maybe pci_host_reset_root_port() doesn't need a return value at all?
> +#endif
> +}
> +
> +void pci_host_handle_link_down(struct pci_dev *port)
> +{
> + pci_info(port, "Recovering Root Port due to Link Down\n");
> + pci_host_recover_root_port(port);
> +}
> +EXPORT_SYMBOL_GPL(pci_host_handle_link_down);
This function must not be called from interrupt context, because both
pci_lock_rescan_remove() and the pci_slot_mutex taken inside
pci_bus_error_reset() can sleep. That isn't obvious from the API name,
so host drivers could easily end up using it like:
register_LDn_irq -> irq isr -> pci_host_handle_link_down()
So it would be better to add a comment about this.
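Something along these lines might work. This is an untested sketch
against this patch, not a standalone file: the kerneldoc wording and the
might_sleep() annotation are only my suggestion, and the function body is
otherwise unchanged from what you posted.

```c
/**
 * pci_host_handle_link_down - Recover a Root Port after a Link Down event
 * @port: Root Port that observed the Link Down event
 *
 * Context: May sleep. Takes the PCI rescan/remove lock and, through
 * pci_bus_error_reset(), pci_slot_mutex. Must not be called from hard
 * IRQ or any other atomic context.
 */
void pci_host_handle_link_down(struct pci_dev *port)
{
	/* Suggested addition: flag atomic-context callers on debug kernels */
	might_sleep();

	pci_info(port, "Recovering Root Port due to Link Down\n");
	pci_host_recover_root_port(port);
}
EXPORT_SYMBOL_GPL(pci_host_handle_link_down);
```

Host drivers would then call it from a threaded IRQ handler or a work
item instead of directly from the hard IRQ handler.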
> +
> MODULE_DESCRIPTION("Common library for PCI host controller drivers");
> MODULE_LICENSE("GPL v2");
> diff --git a/drivers/pci/controller/pci-host-common.h b/drivers/pci/controller/pci-host-common.h
> index b5075d4bd7eb..dd12dd1a1b23 100644
> --- a/drivers/pci/controller/pci-host-common.h
> +++ b/drivers/pci/controller/pci-host-common.h
> @@ -17,6 +17,7 @@ int pci_host_common_init(struct platform_device *pdev,
> struct pci_host_bridge *bridge,
> const struct pci_ecam_ops *ops);
> void pci_host_common_remove(struct platform_device *pdev);
> +void pci_host_handle_link_down(struct pci_dev *port);
>
> struct pci_config_window *pci_host_common_ecam_create(struct device *dev,
> struct pci_host_bridge *bridge, const struct pci_ecam_ops *ops);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 6f09057d83e0..1b37bfe6d079 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5650,6 +5650,7 @@ int pci_bus_error_reset(struct pci_dev *bridge)
> mutex_unlock(&pci_slot_mutex);
> return pci_bus_reset(bridge->subordinate, PCI_RESET_DO_RESET);
> }
> +EXPORT_SYMBOL_GPL(pci_bus_error_reset);
>
> /**
> * pci_probe_reset_bus - probe whether a PCI bus can be reset
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 13b9d9eb714f..d77403d8855b 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -292,3 +292,4 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>
> return status;
> }
> +EXPORT_SYMBOL_GPL(pcie_do_recovery);
>