target crash / host hang with nvme-all.3 branch of nvme-fabrics

Steve Wise swise at opengridcomputing.com
Tue Jun 28 09:31:27 PDT 2016


> On Tue, Jun 28, 2016 at 09:15:22AM -0500, Steve Wise wrote:
> > I'm not so sure.  I don't see where nvmet leaves unsignaled wrs on the SQ.
> > It either posts chains via RDMA-RW and the last in the chain is always
> > signaled (I think), or it posts signaled IO responses.
> 
> Indeed.  So we need to figure out where we don't release a rsp.
> 

Hey Ming, 

For what it's worth, the change you proposed in this thread isn't working for me.
I see maybe one or two successful recoveries, then the target gets stuck.  I see
several workq threads stuck destroying various qps, and one thread stuck draining
a qp.  If this change is not the proper fix, then I'm not going to debug this
further.

More information about the Linux-nvme mailing list