[PATCH v2] nvmet: fix ns enable/disable possible hang
Sagi Grimberg
sagi at grimberg.me
Tue May 21 13:20:28 PDT 2024
When disabling an nvmet namespace, there is a period where the
subsys->lock is released, as the ns disable waits for backend IO to
complete, and the ns percpu ref to be properly killed. The original
intent was to avoid taking the subsystem lock for a prolong period as
other processes may need to acquire it (for example new incoming
connections).
However, it opens up a window where another process may come in and
enable the ns, (re)intiailizing the ns percpu_ref, causing the disable
sequence to hang.
Solve this by taking the global nvmet_config_sem over the entire configfs
enable/disable sequence.
Fixes: a07b4970f464 ("nvmet: add a generic NVMe target")
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
Changes from v1:
- Added Fixes tag for stable
- edited changelog to 73 chars
- added code comment explaining the reasoning of taking the global sem
drivers/nvme/target/configfs.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 6a736d28d767..c90d5b37bb4b 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -676,10 +676,18 @@ static ssize_t nvmet_ns_enable_store(struct config_item *item,
if (kstrtobool(page, &enable))
return -EINVAL;
+ /*
+ * take a global nvmet_config_sem because the disable routine has a
+ * window where it releases the subsys-lock, giving a chance to
+ * a parallel enable to concurrently execute causing the disable to
+ * have a misaccounting of the ns percpu_ref.
+ */
+ down_write(&nvmet_config_sem);
if (enable)
ret = nvmet_ns_enable(ns);
else
nvmet_ns_disable(ns);
+ up_write(&nvmet_config_sem);
return ret ? ret : count;
}
--
2.40.1
More information about the Linux-nvme
mailing list