nvmf/rdma host crash during heavy load and keep alive recovery
Steve Wise
swise at opengridcomputing.com
Thu Sep 8 10:19:07 PDT 2016
> >> Now, given that you already verified that the queues are stopped with
> >> BLK_MQ_S_STOPPED, I'm looking at blk-mq now.
> >>
> >> I see that blk_mq_run_hw_queue() and __blk_mq_run_hw_queue() indeed take
> >> BLK_MQ_S_STOPPED into account. Theoretically if we free the queue
> >> pairs after we passed these checks while the rq_list is being processed
> >> then we can end-up with this condition, but given that it takes
> >> essentially forever (10 seconds) I tend to doubt this is the case.
> >>
> >> HCH, Jens, Keith, any useful pointers for us?
> >>
> >> To summarize we see a stray request being queued long after we set
> >> BLK_MQ_S_STOPPED (and by long I mean 10 seconds).
> >
> > Does nvme-rdma need to call blk_mq_queue_reinit() after it reinits the tag
> > set for that queue as part of reconnecting?
>
> I don't see how that'd help...
>
I can't explain this, but the nvme_rdma_queue.flags field has a bit set that
shouldn't be set:
crash> nvme_rdma_queue.flags -x ffff880e52b8e7e8
flags = 0x14
Bit 2 (NVME_RDMA_Q_DELETING) is set, as expected during teardown, but bit 4 is
also set, and no flag is defined for that bit:
enum nvme_rdma_queue_flags {
NVME_RDMA_Q_CONNECTED = (1 << 0),
NVME_RDMA_IB_QUEUE_ALLOCATED = (1 << 1),
NVME_RDMA_Q_DELETING = (1 << 2),
};
The rest of the structure looks fine. I've also seen crash dumps where bit 3 is
set, which is likewise undefined.
/me confused...