[PATCHv2] nvme-pci: fix stuck reset on concurrent DPC and HP

Keith Busch kbusch at meta.com
Fri Mar 7 15:26:18 PST 2025


From: Keith Busch <kbusch at kernel.org>

The PCIe DPC handling has the nvme driver quiesce the device, attempt to
restart it, then wait for that restart to complete.

The DPC event also toggles the PCIe link. If the slot doesn't have
out-of-band presence detection, this will trigger a pciehp
re-enumeration.

The DPC's error handling that calls nvme_error_resume is holding the
device lock while this happens. This lock prevents pciehp's request to
disconnect the driver from proceeding.

Meanwhile, nvme's reset_work can't make forward progress: it is waiting
on admin IO to a device that isn't there anymore, and the timeout handler
won't do anything to fix it because the device is undergoing error handling.

End result: deadlocked.

Fix this by having the timeout handler disable the nvme queues for a
disconnected PCIe device. We're relying on an IO timeout to unblock
this, which is a minute by default.

Signed-off-by: Keith Busch <kbusch at kernel.org>
---
v1->v2:

  Leveraged the state machine to make the patch that much simpler.
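
  For illustration only, here is a minimal user-space sketch of the
  decision order at the top of nvme_timeout() with and without this
  patch. Everything prefixed model_/MODEL_ is hypothetical and not a
  kernel definition: the booleans stand in for pci_dev_is_disconnected()
  and pci_channel_offline(), the state change is assumed to succeed, and
  the terminal-state check is assumed to cover DELETING and DEAD.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the controller states. */
enum model_ctrl_state {
	MODEL_CTRL_LIVE,
	MODEL_CTRL_RESETTING,
	MODEL_CTRL_DELETING,
	MODEL_CTRL_DEAD,
};

/* What the start of the timeout handler decides to do. */
enum model_timeout_action {
	MODEL_DISABLE_QUEUES,	/* models the "goto disable" path */
	MODEL_RESET_TIMER,	/* models returning BLK_EH_RESET_TIMER */
	MODEL_CONTINUE,		/* fall through to the rest of the handler */
};

struct model_dev {
	enum model_ctrl_state state;
	bool disconnected;	/* models pci_dev_is_disconnected() */
	bool channel_offline;	/* models pci_channel_offline() */
};

/* Assumption: DELETING and DEAD count as terminal in this model. */
static bool model_state_terminal(enum model_ctrl_state state)
{
	return state == MODEL_CTRL_DELETING || state == MODEL_CTRL_DEAD;
}

/*
 * Mirrors the order of checks in the patched hunk. With patched == false
 * the disconnected check is skipped, as in the code before this patch.
 */
static enum model_timeout_action
model_timeout_decision(struct model_dev *dev, bool patched)
{
	/* Assumption: the transition to DELETING always succeeds here. */
	if (patched && dev->disconnected)
		dev->state = MODEL_CTRL_DELETING;

	if (model_state_terminal(dev->state))
		return MODEL_DISABLE_QUEUES;

	if (dev->channel_offline)
		return MODEL_RESET_TIMER;

	return MODEL_CONTINUE;
}
```

  In this model, a RESETTING device that is both disconnected and
  offline only ever gets its timer rearmed without the patch, and is
  sent down the disable path with it.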

 drivers/nvme/host/pci.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 640590b217282..710f3dfef3663 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1411,9 +1411,12 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
 	struct nvme_dev *dev = nvmeq->dev;
 	struct request *abort_req;
 	struct nvme_command cmd = { };
+	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	u32 csts = readl(dev->bar + NVME_REG_CSTS);
 	u8 opcode;
 
+	if (pci_dev_is_disconnected(pdev))
+		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
 	if (nvme_state_terminal(&dev->ctrl))
 		goto disable;
 
@@ -1421,7 +1424,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
 	 * the recovery mechanism will surely fail.
 	 */
 	mb();
-	if (pci_channel_offline(to_pci_dev(dev->dev)))
+	if (pci_channel_offline(pdev))
 		return BLK_EH_RESET_TIMER;
 
 	/*
-- 
2.47.1



