nvmf/rdma host crash during heavy load and keep alive recovery

Sagi Grimberg sagi at grimberg.me
Fri Sep 23 16:57:21 PDT 2016


> The hctx.state has BLK_MQ_S_TAG_ACTIVE set and _not_ BLK_MQ_S_STOPPED.  The
> ns->queue->queue_flags has QUEUE_FLAG_STOPPED bit set.  So the blk_mq queue
> is active and the nvme queue is STOPPED.  I don't know how it gets in this
> state...

Christoph,

I'm still trying to understand how it is possible to
get to a point where the request queue is stopped while
the hardware context is not...
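
To make the mismatch concrete, here is a minimal userspace sketch of the
state Steve reported. It is not kernel code: the struct, the helper names
and the bit positions are all hypothetical stand-ins for
BLK_MQ_S_STOPPED, BLK_MQ_S_TAG_ACTIVE and QUEUE_FLAG_STOPPED, and it
only models the two flag words as plain integers.

```c
#include <stdbool.h>

/* Hypothetical bit positions, for illustration only; the real
 * definitions live in the kernel headers and may differ. */
enum {
	SKETCH_HCTX_STOPPED    = 0,	/* stands in for BLK_MQ_S_STOPPED */
	SKETCH_HCTX_TAG_ACTIVE = 1,	/* stands in for BLK_MQ_S_TAG_ACTIVE */
};
enum {
	SKETCH_QUEUE_STOPPED   = 2,	/* stands in for QUEUE_FLAG_STOPPED */
};

/* Hypothetical snapshot of the two flag words from the report. */
struct sketch_state {
	unsigned long hctx_state;	/* models hctx.state */
	unsigned long queue_flags;	/* models ns->queue->queue_flags */
};

static bool sketch_bit_set(unsigned long word, int bit)
{
	return (word >> bit) & 1UL;
}

/* True when the request queue is marked stopped but the hardware
 * context is not, i.e. the state described in the report. */
static bool sketch_stop_mismatch(const struct sketch_state *s)
{
	return sketch_bit_set(s->queue_flags, SKETCH_QUEUE_STOPPED) &&
	       !sketch_bit_set(s->hctx_state, SKETCH_HCTX_STOPPED);
}
```

With only the tag-active bit set in hctx_state and the stopped bit set in
queue_flags, sketch_stop_mismatch() returns true; that is the combination
I can't explain yet.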

The code in rdma.c seems to do the right thing, but somehow
a stray request sneaks into our submission path when it's not
expected to.

Steve, is the request a normal read/write, or is it something
else triggered from the reconnect flow?


