[PATCH v4 4/5] PCI: host-common: Add link down handling for host bridges
Bjorn Helgaas
helgaas at kernel.org
Mon Jun 2 14:19:44 PDT 2025
On Thu, May 08, 2025 at 12:40:33PM +0530, Manivannan Sadhasivam wrote:
> The PCI link, when down, needs to be recovered to bring it back. But that
> cannot be done in a generic way as link recovery procedure is specific to
> host bridges. So add a new API pci_host_handle_link_down() that could be
> called by the host bridge drivers when the link goes down.
IIUC you plumbed this into the reset path so the standard entries
(pci_reset_function() and the sysfs "reset" files) can now work
for Root Ports on DT systems just like they do for ACPI systems
(assuming the ACPI systems supply an _RST method for the ports). That
all sounds good.
> The API will iterate through all the slots and call the pcie_do_recovery()
> function with 'pci_channel_io_frozen' as the state. This will result in the
> execution of the AER Fatal error handling code. Since the link down
> recovery is pretty much the same as AER Fatal error handling,
> pcie_do_recovery() helper is reused here. First the AER error_detected
> callback will be triggered for the bridge and the downstream devices. Then,
> pci_host_reset_slot() will be called for the slot, which will reset the
> slot using 'reset_slot' callback to recover the link. Once that's done,
> a resume message will be broadcast to the bridge and the downstream devices
> indicating successful link recovery.
We have standard PCIe mechanisms to learn about "link down" events,
e.g., AER Surprise Down error reporting and the Data Link Layer State
Changed events for hot-plug capable ports.
How does this controller-specific "link down" notification relate to
those? Is this for controllers that don't support those AER or
hotplug mechanisms? Or is this a different scenario that wouldn't be
covered by them?
If AER is enabled, do we get both the AER interrupt and the controller
"link down" interrupt?
> If AER support is not enabled in the kernel, only
> pci_bus_error_reset() will be called for each slot, as there is no way we
> could inform the drivers about link recovery.
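For reference, the ordering described in the quoted commit message can be sketched as a standalone userspace mock. This is only an illustration under stated assumptions, not the patch's implementation: struct mock_slot, mock_log(), mock_reset_slot() and mock_handle_link_down() are hypothetical names that stand in for the slot, the controller's 'reset_slot' callback and pci_host_handle_link_down(). No kernel API is called.

```c
#include <stdbool.h>

/* Hypothetical userspace mock; none of these names are kernel APIs. */
#define MOCK_MAX_EVENTS 8

struct mock_slot {
	const char *events[MOCK_MAX_EVENTS];	/* ordered log of steps taken */
	int nr_events;
	bool link_up;
};

/* Record one step so the ordering can be inspected afterwards. */
static void mock_log(struct mock_slot *slot, const char *event)
{
	if (slot->nr_events < MOCK_MAX_EVENTS)
		slot->events[slot->nr_events++] = event;
}

/* Stand-in for the controller-specific 'reset_slot' callback. */
static bool mock_reset_slot(struct mock_slot *slot)
{
	mock_log(slot, "reset_slot");
	slot->link_up = true;	/* assume the reset brings the link back */
	return true;
}

/*
 * Stand-in for the link down handler. With AER support, drivers are
 * notified before the reset and told to resume after it. Without AER
 * support, only the reset is performed because there is no way to
 * notify the drivers.
 */
static bool mock_handle_link_down(struct mock_slot *slot, bool aer_enabled)
{
	slot->link_up = false;

	if (!aer_enabled)
		return mock_reset_slot(slot);

	mock_log(slot, "error_detected");
	if (!mock_reset_slot(slot))
		return false;
	mock_log(slot, "resume");
	return true;
}
```

Calling mock_handle_link_down() on a zero-initialized slot with aer_enabled set records "error_detected", "reset_slot" and "resume" in that order. With aer_enabled clear, only "reset_slot" is recorded.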