nvmf/rdma host crash during heavy load and keep alive recovery
Steve Wise
swise at opengridcomputing.com
Tue Sep 27 08:31:51 PDT 2016
> Hey Christoph,
>
> To apply Bart's series, I needed to use Jens' for-4.9/block branch. But I also
> wanted the latest nvme fixes in linux-4.8-rc8, so I rebased Jens' branch onto
> rc8, then applied Bart's series (which needed a small tweak to patch 2). On top
> of this I have some debug patches that will BUG_ON() if it detects freed RDMA
> objects (requires mem debug on so freed memory has the 0x6b6b... stamp). This
> code base can be perused at:
>
> https://github.com/larrystevenwise/nvme-fabrics/commits/block-for-4.9
>
> I then tried to reproduce, and still hit a crash. I'm debugging now.
>
blk_mq_hw_ctx.state is: 2
nvme_ns.queue.queue_flags is: 0x1f07a00
So the hw_ctx is BLK_MQ_S_TAG_ACTIVE. And the nvme_ns.queue request queue
doesn't have QUEUE_FLAG_STOPPED set. nvme_rdma_ctrl.ctrl state is RECONNECTING.
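
For anyone who wants to re-check the decoding of those two values, here is a
minimal sketch. The bit numbers are assumptions taken from 4.8-era
include/linux/blk-mq.h and include/linux/blkdev.h (BLK_MQ_S_STOPPED = 0,
BLK_MQ_S_TAG_ACTIVE = 1, QUEUE_FLAG_STOPPED = 2), so verify them against the
tree you are actually running. The helper name bit_is_set is hypothetical.

```python
# Assumed bit numbers (4.8-era headers); check them against your own tree.
BLK_MQ_S_STOPPED = 0      # blk_mq_hw_ctx.state bit
BLK_MQ_S_TAG_ACTIVE = 1   # blk_mq_hw_ctx.state bit
QUEUE_FLAG_STOPPED = 2    # request_queue.queue_flags bit


def bit_is_set(value, bit):
    """Return True if bit number `bit` is set in `value` (like test_bit())."""
    return bool(value & (1 << bit))


# Values read from the crash dump, as reported above.
hctx_state = 2
queue_flags = 0x1f07a00

print("hctx TAG_ACTIVE:", bit_is_set(hctx_state, BLK_MQ_S_TAG_ACTIVE))
print("hctx STOPPED:   ", bit_is_set(hctx_state, BLK_MQ_S_STOPPED))
print("queue STOPPED:  ", bit_is_set(queue_flags, QUEUE_FLAG_STOPPED))
```

Under those assumed bit numbers, neither the hw_ctx nor the request queue is
marked stopped, even though the controller is in RECONNECTING.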