NVMeoF: multipath stuck after bringing one ethernet port down

Sagi Grimberg sagi at grimberg.me
Mon Jun 5 01:53:58 PDT 2017


>> [PATCH] nvme-rdma: fast fail incoming requests while we reconnect
>>
>> When we encounter transport/controller errors, error recovery
>> kicks in, which performs:
>> 1. stops io/admin queues
>> 2. moves transport queues out of LIVE state
>> 3. fast fail pending io
>> 4. schedule periodic reconnects.
>>
>> But we also need to fast fail incoming IO that enters after we
>> already scheduled. Given that our queue is not LIVE anymore, simply
>> restart the request queues to fail in .queue_rq
> 
> But we shouldn't _fail_ I/O just because we're reconnecting, we
> need to be able to retry it once reconnected.

I'm not sure. The point is to fail fast so that dm or the user can
fail over traffic. Besides, we iterate and cancel all inflight IO; this
attempts to give the same treatment to IO that arrives later...

In several scsi transports, we have the concept of fast_io_fail_tmo
which we could add to nvme, but from my experience, people usually set
it to a minimum to achieve fast failover (usually smaller than the very
first reconnect attempt).
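
To make the fast_io_fail_tmo idea concrete, here is a minimal user-space
sketch of what such a check might look like. Everything in it is an
assumption for illustration: should_fast_fail(), its parameters, and the
"negative means disabled" convention are hypothetical and are not
existing nvme or scsi code.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether IO should be failed immediately
 * instead of being held for retry while the controller reconnects.
 *
 * now_sec             - current time in seconds
 * reconnect_start_sec - time at which reconnecting began
 * fast_io_fail_tmo    - timeout in seconds; negative disables fast fail
 *                       (a convention assumed for this sketch only)
 */
static bool should_fast_fail(unsigned long now_sec,
			     unsigned long reconnect_start_sec,
			     int fast_io_fail_tmo)
{
	if (fast_io_fail_tmo < 0)
		return false;

	/* Fail fast once the timeout has elapsed since reconnect started. */
	return now_sec - reconnect_start_sec >= (unsigned long)fast_io_fail_tmo;
}
```

With the timeout set to a minimum (0 here), the check returns true as
soon as reconnecting starts, which is the behavior I expect most people
to configure anyway.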

We have nvme_max_retries modparam, so we could simply fail it fast until
we hit this modparam, but I suspect it'll expire very fast.

> 
>> +                   cmd->fabrics.fctype != nvme_fabrics_type_connect) {
>> +                       if (queue->ctrl->ctrl->state ==
>> +                                       NVME_CTRL_RECONNECTING)
>> +                               return -EIO;
>> +                       else
>> +                               return -EAGAIN;
>> +               }
> 
> So this looks somewhat bogus to me, while the rest looks ok.

The point here is that RECONNECTING is a ctrl state that has a
potential to linger for a long time (unlike RESETTING or DELETING),
so we don't want to trigger requeue right away.
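
The decision the hunk is trying to make can be modeled in isolation
roughly as below. This is only a user-space sketch: the enum, the
function name queue_not_live_status(), and the queue_live/is_connect
flags are stand-ins I made up for illustration, not the driver's actual
types or the full condition from the patch.

```c
#include <errno.h>

/* Simplified stand-in for the controller states named in this thread. */
enum ctrl_state {
	CTRL_LIVE,
	CTRL_RESETTING,
	CTRL_RECONNECTING,
	CTRL_DELETING,
};

/*
 * Hypothetical helper: status for a request arriving at a queue.
 * Returns 0 if the request may proceed (queue is live, or the request
 * is the fabrics connect command), -EIO to fail it outright, or
 * -EAGAIN to have it requeued.
 */
static int queue_not_live_status(enum ctrl_state state, int queue_live,
				 int is_connect)
{
	if (queue_live || is_connect)
		return 0;

	/* RECONNECTING can linger, so fail instead of requeueing in a loop. */
	if (state == CTRL_RECONNECTING)
		return -EIO;

	/* RESETTING and DELETING are short-lived, so a requeue is cheap. */
	return -EAGAIN;
}
```

Only the RECONNECTING case returns a hard error; the short-lived states
keep the requeue behavior.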

I'm open to other ideas. I just want to prevent triggering a redundant
loop of queue_rq -> fail with BUSY -> queue_rq -> fail with BUSY ...

Thoughts?


