[RFC PATCH] nvme: fix RCU hole that allowed for endless looping in multipath round robin
Chris Leech
cleech at redhat.com
Mon Mar 21 15:43:04 PDT 2022
Make nvme_ns_remove() match the ordering that the multipath path
selection code assumes.
1) Clearing NVME_NS_READY needs to be followed by an srcu synchronize,
to make sure nothing is still running in __nvme_find_path or
nvme_round_robin_path that could re-assign this ns to current_path.
2) Any matching current_path entries need to be cleared before the ns
is removed from the siblings list, to prevent nvme_round_robin_path
from being called with an "old" ns that is off the list.
3) Only then can the list_del_rcu happen, followed by another
synchronize before any reference counts are released.
---
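For illustration, the ordering argument above can be modelled in
userspace. This is a minimal single-threaded sketch, not kernel code:
toy_path, toy_head, toy_next, toy_round_robin and the two demo_*
functions are hypothetical stand-ins for the nvme structures and
helpers, the selector is deliberately simplified, and the step bound
exists only so the model can report a walk that would otherwise never
terminate.

```c
#include <stdbool.h>

#define NPATHS     3
#define NO_PATH    (-1)  /* no usable path found */
#define WOULD_SPIN (-2)  /* bounded walk gave up; unbounded it would not end */

struct toy_path {        /* hypothetical stand-in for struct nvme_ns */
	bool ready;      /* models NVME_NS_READY */
	bool on_list;    /* models membership in the siblings list */
};

struct toy_head {        /* hypothetical stand-in for struct nvme_ns_head */
	struct toy_path paths[NPATHS];
	int current_path; /* index into paths[], or NO_PATH when cleared */
};

/* Next on-list index after i, wrapping around; NO_PATH if list is empty. */
static int toy_next(const struct toy_head *h, int i)
{
	for (int step = 1; step <= NPATHS; step++) {
		int j = (i + step) % NPATHS;

		if (h->paths[j].on_list)
			return j;
	}
	return NO_PATH;
}

/* Simplified round robin: walk from the old path until we are back at it. */
static int toy_round_robin(struct toy_head *h, int max_steps)
{
	int old = h->current_path;
	int steps = 0;

	if (old == NO_PATH) {
		/* No cached path: do a fresh scan of the list. */
		for (int i = 0; i < NPATHS; i++) {
			if (h->paths[i].on_list && h->paths[i].ready) {
				h->current_path = i;
				return i;
			}
		}
		return NO_PATH;
	}

	/* If old is off the list, "i != old" can never become false. */
	for (int i = toy_next(h, old); i != NO_PATH && i != old;
	     i = toy_next(h, i)) {
		if (++steps > max_steps)
			return WOULD_SPIN;
		if (h->paths[i].ready) {
			h->current_path = i;
			return i;
		}
	}
	return h->paths[old].ready ? old : NO_PATH;
}

/* Path 0 is current; paths 1 and 2 are on the list but not usable. */
static struct toy_head toy_setup(void)
{
	struct toy_head h = {
		.paths = { { true, true }, { false, true }, { false, true } },
		.current_path = 0,
	};
	return h;
}

/* Old order: unlink first, current_path still points at the removed ns. */
static int demo_old_order(void)
{
	struct toy_head h = toy_setup();

	h.paths[0].ready = false;
	h.paths[0].on_list = false;       /* list_del_rcu() equivalent */
	return toy_round_robin(&h, 100);  /* a lookup racing with removal */
}

/* New order: clear current_path before unlinking. */
static int demo_new_order(void)
{
	struct toy_head h = toy_setup();

	h.paths[0].ready = false;
	if (h.current_path == 0)
		h.current_path = NO_PATH; /* clear_current_path equivalent */
	h.paths[0].on_list = false;       /* list_del_rcu() equivalent */
	return toy_round_robin(&h, 100);
}
```

In this model demo_old_order() ends in WOULD_SPIN, because the walk
starts from an entry that is no longer on the list and so never gets
back to it, while demo_new_order() falls through to a fresh scan and
returns NO_PATH. The model has no concurrency, so it says nothing about
the synchronize_srcu() calls themselves, only about why current_path
must be cleared before the list removal.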
drivers/nvme/host/core.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fd4720d37cc0..20778dc9224c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3917,6 +3917,15 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 	set_capacity(ns->disk, 0);
 	nvme_fault_inject_fini(&ns->fault_inject);
 
+	/* ensure that !NVME_NS_READY is seen
+	 * to prevent this ns going back in current_path
+	 */
+	synchronize_srcu(&ns->head->srcu);
+
+	/* wait for concurrent submissions */
+	if (nvme_mpath_clear_current_path(ns))
+		synchronize_srcu(&ns->head->srcu);
+
 	mutex_lock(&ns->ctrl->subsys->lock);
 	list_del_rcu(&ns->siblings);
 	if (list_empty(&ns->head->list)) {
@@ -3928,10 +3937,6 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 	/* guarantee not available in head->list */
 	synchronize_rcu();
 
-	/* wait for concurrent submissions */
-	if (nvme_mpath_clear_current_path(ns))
-		synchronize_srcu(&ns->head->srcu);
-
 	if (!nvme_ns_head_multipath(ns->head))
 		nvme_cdev_del(&ns->cdev, &ns->cdev_device);
 	del_gendisk(ns->disk);
--
2.35.1