nvmf/rdma host crash during heavy load and keep alive recovery
Steve Wise
swise at opengridcomputing.com
Thu Sep 8 13:47:02 PDT 2016
> >> Does this happen if you change the reconnect delay to be something
> >> different than 10 seconds? (say 30?)
> >>
> >
> > Yes. But I noticed something when performing this experiment that is an
> > important point, I think: if I just bring the network interface down and leave
> > it down, we don't crash. During this state, I see the host continually
> > reconnecting after the reconnect delay time, timing out trying to reconnect, and
> > retrying after another reconnect_delay period. I see this for all 10 targets of
> > course. The crash only happens when I bring the interface back up, and the
> > targets begin to reconnect. So the process of successfully reconnecting the
> > RDMA QPs, and restarting the nvme queues is somehow triggering running an nvme
> > request too soon (or perhaps on the wrong queue).
>
> Interesting. Given this is easy to reproduce, can you record the:
> (request_tag, *queue, *qp) for each request submitted?
>
> I'd like to see that the *queue stays the same for each tag
> but the *qp indeed changes.
>
I tried this and didn't hit either BUG_ON(), yet still hit the crash. I believe
this verifies that *queue never changed...

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index c075ea5..a77729e 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -76,6 +76,7 @@ struct nvme_rdma_request {
 	struct ib_reg_wr	reg_wr;
 	struct ib_cqe		reg_cqe;
 	struct nvme_rdma_queue	*queue;
+	struct nvme_rdma_queue	*save_queue;
 	struct sg_table		sg_table;
 	struct scatterlist	first_sgl[];
 };
@@ -354,6 +355,8 @@ static int __nvme_rdma_init_request(struct nvme_rdma_ctrl *ctrl,
 	}
 	req->queue = queue;
+	if (!req->save_queue)
+		req->save_queue = queue;
 	return 0;
@@ -1434,6 +1436,9 @@ static int nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
 	WARN_ON_ONCE(rq->tag < 0);
+	BUG_ON(queue != req->queue);
+	BUG_ON(queue != req->save_queue);
+
 	dev = queue->device->dev;
 	ib_dma_sync_single_for_cpu(dev, sqe->dma,
 			sizeof(struct nvme_command), DMA_TO_DEVICE);
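
For anyone who wants to reason about the check outside the kernel, here is a
minimal user-space sketch of the same idea: remember the queue pointer seen on
the first submission of each tag, flag any later submission that arrives with a
different queue, and count how often the QP pointer changes across reconnects.
This is not driver code. The names trace_submit, struct tag_trace and MAX_TAGS
are hypothetical, and the queue and QP are treated as opaque pointers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_TAGS 128	/* hypothetical tag-space size for this sketch */

/* Per-tag history, mirroring the save_queue idea from the debug patch. */
struct tag_trace {
	const void *first_queue;	/* queue seen on the first submission */
	const void *last_qp;		/* QP seen on the most recent submission */
	unsigned int qp_changes;	/* times the QP differed from the previous one */
};

static struct tag_trace trace[MAX_TAGS];

/*
 * Record one (tag, queue, qp) submission and print it.
 * Returns false if the tag is out of range or if the queue differs from
 * the one first recorded for this tag; returns true otherwise.
 */
static bool trace_submit(int tag, const void *queue, const void *qp)
{
	struct tag_trace *t;

	if (tag < 0 || tag >= MAX_TAGS)
		return false;

	t = &trace[tag];

	/* Latch the queue only once, like "if (!req->save_queue)". */
	if (!t->first_queue)
		t->first_queue = queue;

	/* A QP change is expected after a reconnect; just count it. */
	if (t->last_qp && t->last_qp != qp)
		t->qp_changes++;
	t->last_qp = qp;

	printf("tag %d queue %p qp %p\n", tag, (void *)queue, (void *)qp);

	return t->first_queue == queue;
}
```

Calling trace_submit() twice for the same tag with the same queue but a
different QP returns true both times and leaves qp_changes at 1, which is the
pattern Sagi asked to see. A false return for an in-range tag corresponds to
the case the BUG_ON()s above would have caught.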