target crash / host hang with nvme-all.3 branch of nvme-fabrics

Christoph Hellwig hch at lst.de
Thu Jun 16 08:10:48 PDT 2016


I think nvmet_rdma_delete_ctrl is getting the exclusion vs other
callers of __nvmet_rdma_queue_disconnect wrong, as we rely on a queue
that is undergoing deletion not being on any list.  Additionally, it
checks the cntlid instead of the pointer, which would be harmful if
multiple subsystems have the same cntlid.
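
To illustrate the cntlid point, here is a minimal userspace sketch.
The struct layouts and the helper names (match_by_cntlid,
match_by_pointer, subsys_id) are hypothetical stand-ins, not the
kernel definitions; the only assumption is that a cntlid is unique
within one subsystem but not across subsystems.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct ctrl {
	uint16_t cntlid;	/* only unique within one subsystem */
	int subsys_id;		/* illustrative: owning subsystem */
};

struct queue {
	struct ctrl *ctrl;
};

/* Old-style check: matches any controller that shares the ID. */
static bool match_by_cntlid(const struct queue *q, const struct ctrl *c)
{
	return q->ctrl->cntlid == c->cntlid;
}

/* Pointer check: matches only the exact controller object. */
static bool match_by_pointer(const struct queue *q, const struct ctrl *c)
{
	return q->ctrl == c;
}
```

With two controllers in different subsystems that both carry cntlid 1,
match_by_cntlid would select the other controller's queue as well,
while match_by_pointer would not.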

Does the following patch help?

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index b1c6e5b..9ae65a7 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -1293,19 +1293,21 @@ static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id,
 
 static void nvmet_rdma_delete_ctrl(struct nvmet_ctrl *ctrl)
 {
-	struct nvmet_rdma_queue *queue, *next;
-	static LIST_HEAD(del_list);
+	struct nvmet_rdma_queue *queue, *found = NULL;
 
 	mutex_lock(&nvmet_rdma_queue_mutex);
-	list_for_each_entry_safe(queue, next,
-			&nvmet_rdma_queue_list, queue_list) {
-		if (queue->nvme_sq.ctrl->cntlid == ctrl->cntlid)
-			list_move_tail(&queue->queue_list, &del_list);
+	list_for_each_entry(queue, &nvmet_rdma_queue_list, queue_list) {
+		if (queue->nvme_sq.ctrl == ctrl) {
+			list_del_init(&queue->queue_list);
+			found = queue;
+			break;
+		}
 	}
+
 	mutex_unlock(&nvmet_rdma_queue_mutex);
 
-	list_for_each_entry_safe(queue, next, &del_list, queue_list)
-		nvmet_rdma_queue_disconnect(queue);
+	if (found)
+		__nvmet_rdma_queue_disconnect(found);
 }
 
 static int nvmet_rdma_add_port(struct nvmet_port *port)
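
The locking pattern in the patch (take the queue off the shared list
while holding the mutex, then disconnect it after dropping the mutex)
can be sketched in userspace roughly as follows.  This is only an
illustration: it uses a pthread mutex and a singly linked list in place
of the kernel mutex and list_head, and queue_disconnect and
delete_one_queue are made-up names.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; names do not match the kernel's. */
struct ctrl {
	int unused;
};

struct queue {
	struct ctrl *ctrl;
	struct queue *next;
	bool disconnected;
};

static pthread_mutex_t queue_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct queue *queue_list;	/* protected by queue_mutex */

/* Stand-in for the real teardown; called after the lock is dropped. */
static void queue_disconnect(struct queue *q)
{
	q->disconnected = true;
}

/* Unlink the first queue owned by ctrl, then tear it down unlocked. */
static bool delete_one_queue(struct ctrl *ctrl)
{
	struct queue **pp, *found = NULL;

	pthread_mutex_lock(&queue_mutex);
	for (pp = &queue_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->ctrl == ctrl) {	/* compare pointers, not IDs */
			found = *pp;
			*pp = found->next;	/* off the list before unlocking */
			found->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&queue_mutex);

	if (found)
		queue_disconnect(found);
	return found != NULL;
}
```

Once the queue is unlinked under the mutex, no other path that walks
the list can find it, so only the caller that removed it goes on to
disconnect it.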


