[PATCH 0/3 rfc] Fix nvme-tcp and nvme-rdma controller reset hangs

Christoph Hellwig hch at lst.de
Thu Mar 18 04:45:39 GMT 2021


On Thu, Mar 18, 2021 at 09:51:14AM +0800, Chao Leng wrote:
>>>> The multipath code would have to kick the list.  We could also try to
>>>> split into two flags, one that affects blk_queue_enter and one that
>>>> affects the tag allocation.
>>>>
>>>>> Moving nvme_start_freeze from nvme_rdma_teardown_io_queues to
>>>>> nvme_rdma_configure_io_queues can fix it.
>>>>> It also avoids I/O hanging for a long time if reconnection fails.
>>>>
>>>> Can you explain how we'd still ensure that no new commands get queued
>>>> during teardown using that scheme?
>>> 1. Teardown will cancel all inflight requests, and then multipath will clear the path.
>>> 2. Then we may freeze the controller.
>>> 3. nvme_ns_head_submit_bio cannot find the reconnecting controller as a valid path, so it is safe.
>>
>> In non-mpath (which unfortunately is a valid use-case), there is no
>> failover, and we cannot freeze the queue after we stopped (and/or
>> started) the queues because then fail_non_ready_command() constantly
>> returns BLK_STS_RESOURCE (just causing a re-submission over and over
>> again) and the freeze will never complete (the commands are still
>> inflight from the queue->q_usage_counter perspective).
> If the request sets REQ_FAILFAST_xxx flags, it will hang for a long time if reconnection fails.
> This is not expected.
> Also, if the controller is not live and the controller is frozen, fast_io_fail_tmo will not work.
> This is also not expected.
> So I think freezing the controller while reconnecting is not a good idea.
> Retrying again and again is really not good behavior, but it is at least better than requests hanging for a long time.

Well, it is pretty clear that REQ_FAILFAST_* (and I'm still confused
about the three variants of that) should not block in blk_queue_enter,
and we should make sure nvme-multipath triggers that.  Let me think
of a good way to refactor blk_queue_enter first to make that least
painful.
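
To make the intended behaviour concrete, here is a minimal user-space
sketch of the idea, not kernel code. Every name in it (toy_queue,
toy_queue_enter, TOY_REQ_FAILFAST, TOY_WOULD_SLEEP) is hypothetical and
only stands in for the real blk_queue_enter and REQ_FAILFAST_* logic;
returning -EAGAIN is an illustrative choice, not a statement about what
the block layer returns. It only shows the split being discussed: a
failfast request that meets a frozen queue fails at once, while a normal
request would wait for the unfreeze.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, not a kernel API. */
#define TOY_REQ_FAILFAST 0x1u   /* stands in for the REQ_FAILFAST_* family */
#define TOY_WOULD_SLEEP  1      /* caller would sleep until the unfreeze */

struct toy_queue {
	bool frozen;        /* set while a freeze is in progress */
	unsigned int users; /* models references held on the queue */
};

/*
 * Returns 0 when a reference was taken, -EAGAIN when a failfast request
 * hits a frozen queue, and TOY_WOULD_SLEEP when a normal request would
 * have to wait for the queue to be unfrozen.
 */
static int toy_queue_enter(struct toy_queue *q, unsigned int flags)
{
	if (!q->frozen) {
		q->users++;
		return 0;
	}
	if (flags & TOY_REQ_FAILFAST)
		return -EAGAIN;
	return TOY_WOULD_SLEEP;
}

/* Drops a reference previously taken by toy_queue_enter(). */
static void toy_queue_exit(struct toy_queue *q)
{
	q->users--;
}
```

With the queue frozen, toy_queue_enter(&q, TOY_REQ_FAILFAST) returns
-EAGAIN without touching the reference count, so such a request can no
longer keep a freeze from completing or sit blocked behind it.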

>> So I think we should still start queue freeze before we quiesce
>> the queues.
> We should unquiesce and unfreeze the queues when reconnecting, otherwise fast_io_fail_tmo will not work.
>>
>> I still don't see how the mpath NOWAIT suggestion works either...
> mpath will queue the request to another live path or requeue the request (if there is no usable path), so it will not wait.
>> .

Yes.


