nvmf/rdma host crash during heavy load and keep alive recovery
Steve Wise
swise at opengridcomputing.com
Thu Sep 15 08:10:57 PDT 2016
> The state of the controller is NVME_CTRL_RECONNECTING. In fact, this BUG_ON()
> happened on the reconnect worker thread. Ah, so this is probably the connect
> command on the admin queue?
>
The queue being used at the crash is nvme_rdma_ctrl->queues[1], i.e. not the
admin queue (that is queues[0]). The reconnect worker thread is connecting the
io queues here:
crash> gdb list *nvme_rdma_reconnect_ctrl_work+0x14b
0xffffffffa065cafb is in nvme_rdma_reconnect_ctrl_work
(drivers/nvme/host/rdma.c:647).
642     {
643             int i, ret = 0;
644
645             for (i = 1; i < ctrl->queue_count; i++) {
646                     ret = nvmf_connect_io_queue(&ctrl->ctrl, i);
647                     if (ret)
648                             break;
649             }
650
651             return ret;
nvmf_connect_io_queue(), which submits the fabrics Connect command for the
given qid, is here:
crash> gdb list *nvmf_connect_io_queue+0x114
0xffffffffa064d134 is in nvmf_connect_io_queue
(drivers/nvme/host/fabrics.c:454).
449             strncpy(data->hostnqn, ctrl->opts->host->nqn, NVMF_NQN_SIZE);
450
451             ret = __nvme_submit_sync_cmd(ctrl->connect_q, &cmd, &cqe,
452                             data, sizeof(*data), 0, qid, 1,
453                             BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
454             if (ret) {
455                     nvmf_log_connect_error(ctrl, ret, le32_to_cpu(cqe.result),
456                                     &cmd, data);
457             }
458             kfree(data);
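
For reference, the hctx-to-queue association is established once, at hctx init
time. In this era of the driver, nvme_rdma_init_hctx() looks roughly like the
following (paraphrased from drivers/nvme/host/rdma.c):

    static int nvme_rdma_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
                    unsigned int hctx_idx)
    {
            struct nvme_rdma_ctrl *ctrl = data;
            /* io hctx N maps to queues[N + 1]; queues[0] is the admin queue */
            struct nvme_rdma_queue *queue = &ctrl->queues[hctx_idx + 1];

            BUG_ON(hctx_idx >= ctrl->queue_count);

            hctx->driver_data = queue;
            return 0;
    }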
The hctx passed into nvme_rdma_queue_rq() has BLK_MQ_S_TAG_ACTIVE set in its
state, and hctx->driver_data is the nvme_rdma_queue to be used. That
nvme_rdma_queue holds a different hctx pointer (recorded by my debug code),
which is why we hit the BUG_ON(). Meanwhile, nvme_rdma_queue->hctx->state has
BLK_MQ_S_STOPPED set. So this is more evidence that somehow an hctx is using an
nvme_rdma_queue that wasn't originally assigned to that hctx...
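
The debug code amounts to recording a back-pointer at init time and checking it
in the dispatch path; a minimal sketch (the queue->hctx field is my debug
addition, not an upstream field):

    /* in nvme_rdma_init_hctx(), after hctx->driver_data = queue: */
    queue->hctx = hctx;                     /* debug-only back-pointer */

    /* at the top of nvme_rdma_queue_rq(): */
    struct nvme_rdma_queue *queue = hctx->driver_data;
    BUG_ON(queue->hctx != hctx);            /* fires on the mismatch above */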