[PATCH v7 0/4] PCI: Add support for resetting the Root Ports in a platform specific way
Manivannan Sadhasivam
mani at kernel.org
Sun May 17 23:21:56 PDT 2026
On Tue, Mar 17, 2026 at 12:16:47PM +0100, Niklas Cassel wrote:
> On Wed, Mar 11, 2026 at 08:44:15PM +0530, Manivannan Sadhasivam wrote:
> > On Wed, Mar 11, 2026 at 08:09:53PM +0530, Manivannan Sadhasivam wrote:
> > > On Wed, Mar 11, 2026 at 12:05:15PM +0100, Niklas Cassel wrote:
> > > > On Tue, Mar 10, 2026 at 07:31:58PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > > > > Changes in v7:
> > > > > - Dropped Rockchip Root port reset patch due to reported issues. But the series
> > > > > works on other platforms as tested by others.
> > > >
> > > > Are you referring to
> > > >
> > > > ## On EP side:
> > > > # echo 0 > /sys/kernel/config/pci_ep/controllers/a40000000.pcie-ep/start && \
> > > > sleep 0.1 && echo 1 > /sys/kernel/config/pci_ep/controllers/a40000000.pcie-ep/start
> > > >
> > > > Then running pcitest only having 7 / 16 tests passed ?
> > > >
> > > > If so, isn't that a problem also for qcom?
> > > >
> > >
> > > No, tests are passing on my setup after link up.
> > >
> > > >
> > > > There is no chance that the patch:
> > > > "misc: pci_endpoint_test: Add AER error handlers"
> > > > improves things in this regard?
> > > >
> > > > Or will it simply avoid the "AER: device recovery failed" print?
> > > >
> > >
> > > Yes, as mentioned in the commit message, it just avoids the AER recovery failure
> > > message.
> > >
> >
> > I also realized that Endpoint state is not saved in all the code paths. So the
> > pci_endpoint_test driver has to save/restore the state also. But it is still not
> > clear why that didn't help you.
> >
> > Can you share the snapshot of the entire config space before and after reset
> > using 'lspci -xxxx -s "0000:01:00"'?
>
> If I don't add something like:
>
> diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> index 1eced7a419eb..9d7ee39164d4 100644
> --- a/drivers/misc/pci_endpoint_test.c
> +++ b/drivers/misc/pci_endpoint_test.c
> @@ -1059,6 +1059,9 @@ static int pci_endpoint_test_set_irq(struct pci_endpoint_test *test,
>  		return ret;
>  	}
>  
> +	pr_info("saving PCI state (irq_type: %d)\n", req_irq_type);
> +	pci_save_state(pdev);
> +
>  	return 0;
>  }
>  
> @@ -1453,6 +1456,7 @@ static pci_ers_result_t pci_endpoint_test_error_detected(struct pci_dev *pdev,
>  
>  static pci_ers_result_t pci_endpoint_test_slot_reset(struct pci_dev *pdev)
>  {
> +	pci_restore_state(pdev);
>  	return PCI_ERS_RESULT_RECOVERED;
>  }
>
> On top of your patch.
>
> Then all the BAR tests + MSI and MSI-X tests fail.
>
> There is a huge difference in lspci -vvv output (as I guess is expected),
> including all BARs being marked as disabled.
>
>
> With the patch above. There is zero difference before/after reset, and all
> the BAR tests pass. However, MSI/MSI-X tests still fail with:
>
> # pci_endpoint_test.c:143:MSI_TEST:Expected 0 (0) == ret (-110)
> # pci_endpoint_test.c:143:MSI_TEST:Test failed for MSI1
>
> ETIMEDOUT.
>
> This suggests that pci_endpoint_test on the host side did not receive an
> interrupt.
>
> I don't know why, but considering that lspci output is now (with the
> save+restore) identical, I assume that the problem is not related to
> the host. Unless somehow the host will use a new/different MSI address
> after the root port has been reset, and we restore the old MSI address,
> but looking at the code, dw_pcie_msi_init() is called by
> dw_pcie_setup_rc(), so I would expect the MSI address to be the same.
>

Hi Niklas,

When I rebased this series on top of v7.1-rc1, I ended up seeing the issue you
described here (not sure why I didn't see it earlier). After the Root Port
reset, MSI tests fail, but BAR tests succeed. I also got IOMMU faults on the
host after the endpoint triggers MSI.

I investigated it and found that the MSI iATU mapping gets cleared in hardware
after LDn (link down) happens. But the host continues to use the same
address/size for the endpoint MSI even after reset. Because nothing appears to
have changed, the existing checks in dw_pcie_ep_raise_msi_irq() do not trigger
a remap, and the stale MSI iATU mapping gets reused.

The fix would be to clear the mapping in dw_pcie_ep_cleanup(), which gets called
as part of the PERST# assert/deassert sequence post LDn, and also to set the
msi_iatu_mapped flag to 'false'. This will force dw_pcie_ep_raise_msi_irq() to
set up a fresh iATU mapping the first time it gets called after the reset:

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index d4dc3b24da60..4ae0e1b55f39 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -1035,6 +1035,11 @@ void dw_pcie_ep_cleanup(struct dw_pcie_ep *ep)
 {
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 
+	if (ep->msi_iatu_mapped) {
+		dw_pcie_ep_unmap_addr(ep->epc, 0, 0, ep->msi_mem_phys);
+		ep->msi_iatu_mapped = false;
+	}
+
 	dwc_pcie_debugfs_deinit(pci);
 	dw_pcie_edma_remove(pci);
 }

With this change, MSI works after Root Port reset without any issues on our Qcom
endpoint/host setup.
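
To illustrate the failure mode, here is a minimal userspace model of the cached
mapping logic. It is only a sketch and not the driver code: struct ep_model, the
model_*() helpers and the address/size values are all hypothetical, and the
real checks in dw_pcie_ep_raise_msi_irq() may differ in detail. It only assumes
what is described above: hardware loses the window on link down, and software
remaps only when its cached flag/address/size say so.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the endpoint state, not struct dw_pcie_ep. */
struct ep_model {
	bool hw_window_valid;	/* models the iATU entry in hardware */
	bool msi_iatu_mapped;	/* software's cached "already mapped" flag */
	uint64_t msi_msg_addr;	/* cached MSI address */
	size_t msi_map_size;	/* cached mapping size */
};

/* Link down: hardware drops its iATU programming, software state is untouched. */
static void model_link_down(struct ep_model *ep)
{
	ep->hw_window_valid = false;
}

/* Cleanup on PERST#: optionally invalidate the cached mapping (the proposed fix). */
static void model_cleanup(struct ep_model *ep, bool clear_cached_mapping)
{
	if (clear_cached_mapping && ep->msi_iatu_mapped)
		ep->msi_iatu_mapped = false;
}

/* Returns true if the MSI write would go through a valid hardware window. */
static bool model_raise_msi(struct ep_model *ep, uint64_t addr, size_t size)
{
	/* Remap only if nothing is cached or the host changed address/size. */
	if (!ep->msi_iatu_mapped || ep->msi_msg_addr != addr ||
	    ep->msi_map_size != size) {
		ep->hw_window_valid = true;	/* program a fresh window */
		ep->msi_iatu_mapped = true;
		ep->msi_msg_addr = addr;
		ep->msi_map_size = size;
	}

	return ep->hw_window_valid;
}

/* One MSI before reset, then link down + cleanup, then one MSI after reset. */
static bool model_msi_works_after_reset(bool clear_cached_mapping)
{
	struct ep_model ep = { 0 };
	const uint64_t addr = 0xfee00000ULL;	/* arbitrary example address */
	const size_t size = 0x1000;		/* arbitrary example size */

	if (!model_raise_msi(&ep, addr, size))
		return false;

	model_link_down(&ep);
	model_cleanup(&ep, clear_cached_mapping);

	/* Host reuses the same address/size after the reset. */
	return model_raise_msi(&ep, addr, size);
}
```

In this model, model_msi_works_after_reset(false) returns false because the
cached flag hides the lost hardware window, while
model_msi_works_after_reset(true) returns true because the cleared flag forces
a remap.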

Please test this change on your Rockchip setup as well. You have to make sure
that dw_pcie_ep_cleanup() is called during PERST# assert/deassert.

I'm going to respin the series with this fix. If you confirm it works for you,
then we can merge your Rockchip Root Port change.

- Mani

--
மணிவண்ணன் சதாசிவம்