[PATCH v1 0/1] nvme: Fix problem when booting from NVMe drive was leading to a hang.

Michael Kropaczek michael.kropaczek at solidigm.com
Wed Feb 21 00:43:40 PST 2024


Description:

During an endurance test in which a system was repeatedly rebooted from an
NVMe drive, the boot process occasionally hung. The test was configured for
1000 reboot cycles at 120 s intervals; the hang occurred after ~300 cycles.
Investigation showed that the NVMe driver did not disable the host memory
buffer (HMB) during shutdown, leaving the NVMe controller in a state that
prevented proper initialization in the BIOS pre-boot stage.
Disabling the HMB and calling nvme_free_host_mem(dev) in nvme_dev_disable()
fixed the issue.
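
The ordering that matters here (tell the controller to stop using the
buffer, then release the host allocation) can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: every
model_* name below is hypothetical and is not a kernel or NVMe API, and the
"controller" is just a struct that remembers what the host told it.

```c
/* User-space model only: none of these names exist in the kernel. */
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* What the hypothetical controller remembers about the HMB. */
struct model_ctrl {
	bool hmb_enabled;      /* controller believes it may use the HMB */
	const void *hmb_addr;  /* address the host handed over */
};

/* Hypothetical host-side driver state. */
struct model_dev {
	struct model_ctrl ctrl;
	void *host_mem;        /* host allocation backing the HMB */
	size_t host_mem_size;
};

/* Allocate a buffer and tell the controller to use it. */
static int model_enable_hmb(struct model_dev *dev, size_t size)
{
	dev->host_mem = calloc(1, size);
	if (!dev->host_mem)
		return -1;
	dev->host_mem_size = size;
	dev->ctrl.hmb_addr = dev->host_mem;
	dev->ctrl.hmb_enabled = true;
	return 0;
}

/* Step 1: tell the controller to stop using the buffer. */
static void model_disable_hmb(struct model_dev *dev)
{
	dev->ctrl.hmb_enabled = false;
	dev->ctrl.hmb_addr = NULL;
}

/* Step 2: release the host allocation (controller state is untouched). */
static void model_free_host_mem(struct model_dev *dev)
{
	free(dev->host_mem);
	dev->host_mem = NULL;
	dev->host_mem_size = 0;
}

/* Shutdown in the order the patch uses: disable first, then free. */
static void model_shutdown(struct model_dev *dev)
{
	if (dev->host_mem)
		model_disable_hmb(dev);
	model_free_host_mem(dev);
}

/* True if the controller still thinks it owns memory the host gave up. */
static bool model_ctrl_has_stale_hmb(const struct model_dev *dev)
{
	return dev->ctrl.hmb_enabled && dev->host_mem == NULL;
}
```

In this model, calling model_free_host_mem() without model_disable_hmb()
leaves model_ctrl_has_stale_hmb() true, which loosely corresponds to a
controller that still believes the HMB is enabled; model_shutdown() leaves
it false.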

Michael Kropaczek (1):
  nvme: Fix problem when booting from NVMe drive was leading to a hang.

 drivers/nvme/host/pci.c | 10 ++++++++++
 1 file changed, 10 insertions(+)


base-commit: 8d30528a170905ede9ab6ab81f229e441808590b
-- 
2.34.1

From 1506c5099def8ecdd489d681033230492fa65bb2 Mon Sep 17 00:00:00 2001
From: Michael Kropaczek <michael.kropaczek at solidigm.com>
Date: Tue, 20 Feb 2024 23:40:36 -0800
Subject: [PATCH v1 1/1] nvme: Fix problem when booting from NVMe drive was
 leading to a hang.
To: linux-nvme at lists.infradead.org
Cc: Keith Busch <kbusch at kernel.org>,
    Jens Axboe <axboe at fb.com>,
    Christoph Hellwig <hch at lst.de>,
    Sagi Grimberg <sagi at grimberg.me>,
    Michael Kropaczek <michael.kropaczek at solidigm.com>

On certain host architectures and hardware, DRAM retains its contents across
reboot cycles. Certain NVMe controllers accessed the host memory buffer (HMB)
after startup, which led to an undefined state and prevented proper
initialization in the BIOS boot stage. Disabling and freeing the HMB during
host shutdown prevents the problem.

Signed-off-by: Michael Kropaczek <michael.kropaczek at solidigm.com>
---
 drivers/nvme/host/pci.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e6267a6aa380..ccddb7c379e3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2593,6 +2593,16 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
 	}
 
+	/*
+	 * On some hosts DRAM retains its contents across reboots, and some
+	 * controllers access the host memory buffer after reset, ending up in
+	 * an undefined state. Disable and free the HMB to prevent that.
+	 */
+	if (dev->hmb)
+		nvme_set_host_mem(dev, 0);
+
+	nvme_free_host_mem(dev);
+
 	nvme_quiesce_io_queues(&dev->ctrl);
 
 	if (!dead && dev->ctrl.queue_count > 0) {
-- 
2.34.1