[PATCH v7 2/4] PCI: host-common: Add link down handling for Root Ports

Shawn Lin shawn.lin at rock-chips.com
Tue Mar 10 22:20:55 PDT 2026


On 2026/03/11 (Wed) 13:04, Manivannan Sadhasivam wrote:
> On Wed, Mar 11, 2026 at 08:55:01AM +0800, Shawn Lin wrote:
>> Hi Mani
>>
>> On 2026/03/10 (Tue) 22:02, Manivannan Sadhasivam via B4 Relay wrote:
>>> From: Manivannan Sadhasivam <mani at kernel.org>
>>>
>>> The PCI link, when down, needs to be recovered to bring it back. But on
>>> some platforms, that cannot be done in a generic way as link recovery
>>> procedure is platform specific. So add a new API
>>> pci_host_handle_link_down() that could be called by the host bridge drivers
>>> for a specific Root Port when the link goes down.
>>>
>>> The API accepts the 'pci_dev' corresponding to the Root Port which observed
>>> the link down event. If CONFIG_PCIEAER is enabled, the API calls
>>> pcie_do_recovery() function with 'pci_channel_io_frozen' as the state. This
>>> will result in the execution of the AER Fatal error handling code. Since
>>> the link down recovery is pretty much the same as AER Fatal error handling,
>>> pcie_do_recovery() helper is reused here. First, the AER error_detected()
>>> callback will be triggered for the bridge and then for the downstream
>>> devices. Finally, pci_host_reset_root_port() will be called for the Root
>>> Port, which will reset the Root Port using 'reset_root_port' callback to
>>> recover the link. Once that's done, a resume message will be broadcast to
>>> the bridge and the downstream devices, indicating successful link recovery.
>>>
>>> But if CONFIG_PCIEAER is not enabled in the kernel, only
>>> pci_host_reset_root_port() API will be called, which will in turn call
>>> pci_bus_error_reset() to just reset the Root Port as there is no way we
>>> could inform the drivers about link recovery.
>>>
>>> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam at linaro.org>
>>> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam at oss.qualcomm.com>
>>> Tested-by: Brian Norris <briannorris at chromium.org>
>>> Tested-by: Krishna Chaitanya Chundru <krishna.chundru at oss.qualcomm.com>
>>> Tested-by: Richard Zhu <hongxing.zhu at nxp.com>
>>> Reviewed-by: Frank Li <Frank.Li at nxp.com>
>>> ---
>>>    drivers/pci/controller/pci-host-common.c | 35 ++++++++++++++++++++++++++++++++
>>>    drivers/pci/controller/pci-host-common.h |  1 +
>>>    drivers/pci/pci.c                        |  1 +
>>>    drivers/pci/pcie/err.c                   |  1 +
>>>    4 files changed, 38 insertions(+)
>>>
>>> diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c
>>> index d6258c1cffe5..15ebff8a542a 100644
>>> --- a/drivers/pci/controller/pci-host-common.c
>>> +++ b/drivers/pci/controller/pci-host-common.c
>>> @@ -12,9 +12,11 @@
>>>    #include <linux/of.h>
>>>    #include <linux/of_address.h>
>>>    #include <linux/of_pci.h>
>>> +#include <linux/pci.h>
>>>    #include <linux/pci-ecam.h>
>>>    #include <linux/platform_device.h>
>>> +#include "../pci.h"
>>>    #include "pci-host-common.h"
>>>    static void gen_pci_unmap_cfg(void *ptr)
>>> @@ -106,5 +108,38 @@ void pci_host_common_remove(struct platform_device *pdev)
>>>    }
>>>    EXPORT_SYMBOL_GPL(pci_host_common_remove);
>>> +static pci_ers_result_t pci_host_reset_root_port(struct pci_dev *dev)
>>> +{
>>> +	int ret;
>>> +
>>> +	pci_lock_rescan_remove();
>>> +	ret = pci_bus_error_reset(dev);
>>> +	pci_unlock_rescan_remove();
>>> +	if (ret) {
>>> +		pci_err(dev, "Failed to reset Root Port: %d\n", ret);
>>> +		return PCI_ERS_RESULT_DISCONNECT;
>>> +	}
>>> +
>>> +	pci_info(dev, "Root Port has been reset\n");
>>> +
>>> +	return PCI_ERS_RESULT_RECOVERED;
>>> +}
>>> +
>>> +static void pci_host_recover_root_port(struct pci_dev *port)
>>> +{
>>> +#if IS_ENABLED(CONFIG_PCIEAER)
>>> +	pcie_do_recovery(port, pci_channel_io_frozen, pci_host_reset_root_port);
>>> +#else
>>> +	pci_host_reset_root_port(port);
>>
>> Since pci_host_reset_root_port() returns pci_ers_result_t, shouldn't we
>> check the result? If the return value is intentionally ignored here,
>> maybe pci_host_reset_root_port actually doesn't need a return value at
>> all?
>>
> 
> The return value is mostly for pcie_do_recovery(), which iterates through the
> subordinate devices and calls pci_host_reset_root_port(). It also makes use of
> the return value, so we cannot make it void.
> 
> The reason I ignored the return value in pci_host_handle_link_down() is
> that we cannot do much in the case of failure other than report it,
> which pci_host_reset_root_port() already takes care of.
> 

Ok, it makes sense to me, thanks for the explanation.
Feel free to add:

Reviewed-by: Shawn Lin <shawn.lin at rock-chips.com>
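
To spell out the point about the return value, here is a minimal userspace
model (not kernel code). All model_* names, the two-value enum and the
reset_fails field are stand-ins invented for illustration; only the shape of
the callback follows the patch. It shows that the recovery helper consumes
the callback's result, while the non-AER path can only rely on the message
the callback has already logged.

```c
/*
 * Userspace model, NOT kernel code. Types and functions are simplified
 * stand-ins that mirror the shape of the names used in the patch.
 */
#include <assert.h>
#include <stdio.h>

/* Reduced model of pci_ers_result_t: only the two values the patch returns. */
typedef enum {
	PCI_ERS_RESULT_RECOVERED,
	PCI_ERS_RESULT_DISCONNECT,
} pci_ers_result_t;

struct pci_dev {
	const char *name;
	int reset_fails;	/* hypothetical knob to force a reset failure */
};

/* Same shape as the reset callback in the patch: it logs its own failure. */
pci_ers_result_t model_reset_root_port(struct pci_dev *dev)
{
	if (dev->reset_fails) {
		fprintf(stderr, "%s: Failed to reset Root Port\n", dev->name);
		return PCI_ERS_RESULT_DISCONNECT;
	}

	return PCI_ERS_RESULT_RECOVERED;
}

/* Stand-in for the recovery helper: it needs the callback's result. */
pci_ers_result_t model_do_recovery(struct pci_dev *dev,
		pci_ers_result_t (*reset)(struct pci_dev *))
{
	pci_ers_result_t status = reset(dev);

	if (status != PCI_ERS_RESULT_RECOVERED)
		fprintf(stderr, "%s: recovery failed, no resume sent\n",
			dev->name);

	return status;
}

/* Non-AER path: the failure was already logged, so the result is dropped. */
void model_recover_without_aer(struct pci_dev *dev)
{
	(void)model_reset_root_port(dev);
}
```

In this model the AER path would be model_do_recovery(&dev,
model_reset_root_port) and the non-AER path model_recover_without_aer(&dev).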

>>> +#endif
>>> +}
>>> +
>>> +void pci_host_handle_link_down(struct pci_dev *port)
>>> +{
>>> +	pci_info(port, "Recovering Root Port due to Link Down\n");
>>> +	pci_host_recover_root_port(port);
>>> +}
>>> +EXPORT_SYMBOL_GPL(pci_host_handle_link_down);
>>
>> This function shouldn't be called from interrupt context because of
>> pci_lock_rescan_remove() and pci_bus_error_reset()::pci_slot_mutex,
>> but that's not obvious from the API name. Host drivers are prone to
>> using it like:
>>
>> register_LDn_irq -> irq isr -> pci_host_handle_link_down()
>>
>> So perhaps it would be better to add a comment about it.
>>
> 
> Yes, I agree. I mentioned in the cover letter that this API should be called
> from a threaded IRQ handler, but it should be mentioned in the API description
> too. I will add it in the next version or amend it while applying.
> 
> - Mani
> 
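
For reference, the hard-IRQ/threaded split discussed above can be modelled in
userspace as below. This is only a sketch, not driver code: struct model_port,
its fields and every model_* function are hypothetical, and the int return
value of model_handle_link_down() exists only in the model (the real API
returns void). In an actual host driver the two halves would presumably be
registered as the handler and thread_fn of request_threaded_irq() or
devm_request_threaded_irq().

```c
/*
 * Userspace model, NOT kernel code. It only illustrates which half of a
 * threaded interrupt may call an API that takes sleeping locks.
 */
#include <assert.h>

/* Same three values as the kernel's irqreturn_t, redefined for the model. */
typedef enum { IRQ_NONE, IRQ_HANDLED, IRQ_WAKE_THREAD } irqreturn_t;

struct model_port {
	int in_hardirq;		/* 1 while the hard-IRQ half is running */
	int link_down_pending;	/* event recorded by the hard-IRQ half */
	int resets_done;	/* number of modelled Root Port resets */
};

/*
 * Stand-in for pci_host_handle_link_down(). The real function takes
 * mutexes, so it may sleep. The model returns -1 when called from the
 * hard-IRQ half, where sleeping is not allowed, and 0 otherwise.
 */
int model_handle_link_down(struct model_port *port)
{
	if (port->in_hardirq)
		return -1;

	port->resets_done++;
	return 0;
}

/* Hard-IRQ half: must not sleep, so it only records the event and defers. */
irqreturn_t model_link_down_hardirq(struct model_port *port)
{
	port->in_hardirq = 1;
	port->link_down_pending = 1;
	port->in_hardirq = 0;

	return IRQ_WAKE_THREAD;
}

/* Threaded half: runs in process context, so the sleeping API is allowed. */
irqreturn_t model_link_down_thread(struct model_port *port)
{
	if (!port->link_down_pending)
		return IRQ_NONE;

	port->link_down_pending = 0;
	model_handle_link_down(port);

	return IRQ_HANDLED;
}
```

The hard-IRQ half never touches the recovery API here; it only wakes the
thread, which is the calling pattern the API comment should ask for.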


