NVMeoF: multipath stuck after bringing one ethernet port down
Christoph Hellwig
hch at lst.de
Mon Jun 5 01:40:37 PDT 2017
On Tue, May 30, 2017 at 05:17:40PM +0300, Sagi Grimberg wrote:
> [PATCH] nvme-rdma: fast fail incoming requests while we reconnect
>
> When we encounter transport/controller errors, error recovery
> kicks in, which performs:
> 1. stops io/admin queues
> 2. moves transport queues out of LIVE state
> 3. fast fail pending io
> 4. schedule periodic reconnects.
>
> But we also need to fast fail incoming IO that enters after we
> already scheduled. Given that our queue is not LIVE anymore, simply
> restart the request queues to fail in .queue_rq
But we shouldn't _fail_ I/O just because we're reconnecting; we
need to be able to retry it once reconnected.
> +	    cmd->fabrics.fctype != nvme_fabrics_type_connect) {
> +		if (queue->ctrl->ctrl.state == NVME_CTRL_RECONNECTING)
> +			return -EIO;
> +		else
> +			return -EAGAIN;
> +	}
So this looks somewhat bogus to me, while the rest looks ok.
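
To make the difference concrete, here is a rough userspace sketch of the two
behaviours being discussed. It is not driver code: `struct model_queue`,
`enum ctrl_state`, `queue_ready_fail_fast()` and `queue_ready_retry()` are
hypothetical names made up for illustration, and the assumption is that
-EAGAIN means "hand the request back to be retried later" while -EIO means
"complete it with an error". The first helper follows the quoted hunk, the
second never fails I/O while the queue is down.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_RECONNECTING };

struct model_queue {
	bool live;                  /* models the queue's LIVE flag */
	enum ctrl_state ctrl_state; /* models the controller state */
};

/*
 * Behaviour of the quoted hunk: once the queue is not live, anything
 * other than a connect command is failed while reconnecting and asked
 * to retry in every other state.
 */
static int queue_ready_fail_fast(const struct model_queue *q, bool is_connect)
{
	if (q->live || is_connect)
		return 0;
	if (q->ctrl_state == CTRL_RECONNECTING)
		return -EIO;
	return -EAGAIN;
}

/*
 * Alternative: never fail just because the queue is down; always ask
 * for a retry so the I/O can be reissued after the reconnect.
 */
static int queue_ready_retry(const struct model_queue *q, bool is_connect)
{
	if (q->live || is_connect)
		return 0;
	return -EAGAIN;
}
```

With a queue that is not live and a controller in the reconnecting state, the
first helper returns -EIO for a normal request and the second returns -EAGAIN;
a connect command passes through both with 0.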