[PATCH 0/3 rfc] Fix nvme-tcp and nvme-rdma controller reset hangs

Chao Leng lengchao at huawei.com
Thu Mar 18 01:51:14 GMT 2021



On 2021/3/18 2:43, Sagi Grimberg wrote:
> 
>>>>>> Will it work if nvme mpath used request NOWAIT flag for its submit_bio()
>>>>>> call, and add the bio to the requeue_list if blk_queue_enter() fails? I
>>>>>> think that looks like another way to resolve the deadlock, but we need
>>>>>> the block layer to return a failed status to the original caller.
>>>
>>> Yes, I think BLK_MQ_REQ_NOWAIT makes total sense here.  dm-mpath also
>>> uses it for its request allocation for similar reasons.
>>>
>>>>>
>>>>> But who would kick the requeue list? and that would make near-tag-exhaust performance stink...
>>>
>>> The multipath code would have to kick the list.  We could also try to
>>> split into two flags, one that affects blk_queue_enter and one that
>>> affects the tag allocation.
>>>
>>>> moving nvme_start_freeze from nvme_rdma_teardown_io_queues to nvme_rdma_configure_io_queues can fix it.
>>>> It can also avoid I/O hang long time if reconnection failed.
>>>
>>> Can you explain how we'd still ensure that no new commands get queued
>>> during teardown using that scheme?
>> 1. tear down will cancel all inflight requests, and then multipath will clear the path.
>> 2. and then we may freeze the controller.
>> 3. nvme_ns_head_submit_bio cannot find the reconnecting controller as a valid path, so it is safe.
> 
> In non-mpath (which unfortunately is a valid use-case), there is no
> failover, and we cannot freeze the queue after we stopped (and/or
> started) the queues because then fail_non_ready_command() constantly
> returns BLK_STS_RESOURCE (just causing a re-submission over and over
> again) and the freeze will never complete (the commands are still
> inflight from the queue->q_usage_counter perspective).
If a request has REQ_FAILFAST_xxx set, it will still hang for a long time if reconnection fails.
This is not expected.
Also, if the controller is not live and is frozen, fast_io_fail_tmo will not work.
This is also not expected.
So I think freezing the controller while reconnecting is not a good idea.
Retrying again and again is certainly not good behavior, but it is at least better than requests hanging for a long time.
> 
> So I think we should still start queue freeze before we quiesce
> the queues.
We should unquiesce and unfreeze the queues when reconnecting; otherwise fast_io_fail_tmo will not work.
> 
> I still don't see how the mpath NOWAIT suggestion works either...
mpath will queue the request to another live path, or requeue the request (if there is no usable path), so it will not wait.
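Roughly, the behavior I mean can be modeled as below. This is only a user-space toy sketch, not the actual nvme multipath code: every toy_* name is made up for illustration, and it assumes that a NOWAIT queue enter simply fails instead of sleeping when the path is not live or its queue is frozen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; none of these are kernel types or functions. */
struct toy_path {
    bool live;      /* controller is live (not resetting or reconnecting) */
    bool frozen;    /* queue is frozen: a blocking enter would sleep here */
    int  submitted; /* number of bios this path has accepted */
};

struct toy_head {
    struct toy_path *paths;
    size_t npaths;
    int requeued;   /* bios parked on the head's requeue list */
};

enum toy_result { TOY_SUBMITTED, TOY_REQUEUED };

/* Models a NOWAIT queue enter: fail immediately instead of sleeping. */
static bool toy_queue_enter_nowait(const struct toy_path *p)
{
    return p->live && !p->frozen;
}

/* Try every path without blocking; park the bio if none accepts it. */
enum toy_result toy_submit_bio(struct toy_head *h)
{
    assert(h != NULL);
    for (size_t i = 0; i < h->npaths; i++) {
        if (toy_queue_enter_nowait(&h->paths[i])) {
            h->paths[i].submitted++;
            return TOY_SUBMITTED;
        }
    }
    h->requeued++;
    return TOY_REQUEUED;
}

/*
 * Resubmit everything on the requeue list once. Bios that still find no
 * usable path are parked again. Returns how many were actually submitted.
 */
int toy_kick_requeue(struct toy_head *h)
{
    int pending = h->requeued;
    int done = 0;

    h->requeued = 0;
    while (pending-- > 0) {
        if (toy_submit_bio(h) == TOY_SUBMITTED)
            done++;
    }
    return done;
}
```

In this model the submitter never sleeps. Something still has to call the equivalent of toy_kick_requeue() when a path becomes usable again, which is the "who would kick the requeue list" question raised above.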
> .


