nvmf/rdma host crash during heavy load and keep alive recovery

Sagi Grimberg sagi at grimberg.me
Thu Sep 8 00:45:35 PDT 2016


>> Does this happen if you change the reconnect delay to be something
>> different than 10 seconds? (say 30?)
>>
>
> Yes.  But I noticed something when performing this experiment that is an
> important point, I think:  if I just bring the network interface down and leave
> it down, we don't crash.  During this state, I see the host continually
> reconnecting after the reconnect delay time, timing out trying to reconnect, and
> retrying after another reconnect_delay period.  I see this for all 10 targets of
> course.  The crash only happens when I bring the interface back up, and the
> targets begin to reconnect.  So the process of successfully reconnecting the
> RDMA QPs and restarting the nvme queues is somehow triggering an nvme
> request to run too soon (or perhaps on the wrong queue).

Interesting. Given this is easy to reproduce, can you record the
(request_tag, *queue, *qp) triple for each submitted request?

I'd like to see that the *queue stays the same for each tag
but the *qp indeed changes.
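
Something like the below at the top of nvme_rdma_queue_rq() would do
(an untested sketch; it assumes hctx->driver_data still points at the
nvme_rdma_queue, as set up in nvme_rdma_init_hctx()):
--
	/*
	 * Debug sketch: emit the (tag, queue, qp) triple on every
	 * submission so the trace shows whether a tag's *queue stays
	 * stable while its *qp changes across reconnects.
	 */
	struct nvme_rdma_queue *queue = hctx->driver_data;
	struct request *rq = bd->rq;

	pr_info("nvme_rdma: submit tag=%d queue=%p qp=%p\n",
		rq->tag, queue, queue->qp);
--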

>> Can you also give patch [1] a try? It's not a solution, but I want
>> to see if it hides the problem...
>>
>
> hmm.  I ran the experiment once with [1] and it didn't crash.  I ran it a 2nd
> time and hit a new crash.  Maybe a problem with [1]?

Strange, I don't see how we can visit rdma_destroy_qp twice given
that the NVME_RDMA_IB_QUEUE_ALLOCATED bit protects against it.
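
For reference, the kind of guard I mean looks like this (a sketch of
the assumed shape, with the teardown path consuming the flag via
test_and_clear_bit()):
--
	/*
	 * Only the caller that actually clears the ALLOCATED bit gets
	 * to tear down the QP/CQ; a second trip through the teardown
	 * path sees the bit already clear and becomes a no-op.
	 */
	if (!test_and_clear_bit(NVME_RDMA_IB_QUEUE_ALLOCATED, &queue->flags))
		return;

	rdma_destroy_qp(queue->cm_id);
	ib_free_cq(queue->ib_cq);
--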

Not sure if it fixes anything, but we probably need it regardless.
Can you give it another go with this on top:
--
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 43602cebf097..89023326f397 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -542,11 +542,12 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue,
                 goto out_destroy_qp;
         }
         set_bit(NVME_RDMA_IB_QUEUE_ALLOCATED, &queue->flags);
+       clear_bit(NVME_RDMA_Q_DELETING, &queue->flags);

         return 0;

  out_destroy_qp:
-       ib_destroy_qp(queue->qp);
+       rdma_destroy_qp(queue->cm_id);
  out_destroy_ib_cq:
         ib_free_cq(queue->ib_cq);
  out:
@@ -591,15 +592,16 @@ static int nvme_rdma_init_queue(struct nvme_rdma_ctrl *ctrl,
         if (ret) {
                 dev_info(ctrl->ctrl.device,
                         "rdma_resolve_addr wait failed (%d).\n", ret);
-               goto out_destroy_cm_id;
+               goto out_destroy_queue_ib;
         }

         set_bit(NVME_RDMA_Q_CONNECTED, &queue->flags);

         return 0;

-out_destroy_cm_id:
+out_destroy_queue_ib:
         nvme_rdma_destroy_queue_ib(queue);
+out_destroy_cm_id:
         rdma_destroy_id(queue->cm_id);
         return ret;
  }
--
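
For the record, the thinking behind the two hunks: a QP set up with
rdma_create_qp() should be torn down with rdma_destroy_qp() (which
takes the cm_id), not ib_destroy_qp(); and if I read the flow right,
by the time the rdma_resolve_addr wait fails the CM handler may
already have allocated the queue's QP and CQ, so the error path has
to run nvme_rdma_destroy_queue_ib() before destroying the cm_id.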


