[PATCH 1/3] nvme-core: introduce sync io queues

Chao Leng lengchao at huawei.com
Tue Oct 20 22:10:29 EDT 2020



On 2020/10/21 1:06, Chaitanya Kulkarni wrote:
> On 10/20/20 02:12, Chao Leng wrote:
>> Introduce sync io queues for some scenarios which just only need sync
>> io queues not sync all queues.
>>
>> Signed-off-by: Chao Leng <lengchao at huawei.com>
> 
> Can you please explain the scenario in detail ?
It is used to avoid a race between timeout and teardown; see patch 2/3.
The race may cause the following problems:
1. The issue reported by Yi Zhang <yi.zhang at redhat.com>
Details: https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/
2. BUG_ON in blk_mq_requeue_request
Error recovery and timeout may both complete the same request.
First, error recovery cancels the request in the teardown process; the
request is retried in completion, and rq->state is changed to IDLE.
Then the timeout handler completes the request again and retries it in
the same way, which triggers the BUG_ON in blk_mq_requeue_request.
3. Abnormal link disconnection
First, error recovery cancels all requests, the reconnect succeeds, and
the requests are restarted. Then the timeout handler completes the
request again, and the queue is stopped in
nvme_rdma(tcp)_complete_timed_out, which causes an abnormal link
disconnection. This only happens if the timeout handling is delayed for
a long time for some reason, such as a hardware interrupt, so the
probability is low.
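
The double completion in case 2 can be sketched as a small user-space
model. This is only an illustration, not kernel code: struct model_rq,
model_complete_and_retry() and model_race() are hypothetical names, and
the model assumes the BUG_ON fires because the request is already
waiting for retry when it is requeued a second time. Please check
blk_mq_requeue_request for the exact condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the blk-mq request states. */
enum rq_state { RQ_IDLE, RQ_IN_FLIGHT };

/* Hypothetical model of a request; not the kernel's struct request. */
struct model_rq {
	enum rq_state state;
	bool on_requeue_list;	/* already queued for retry */
};

/*
 * Model of a completion that retries the request. Returns false where
 * the real requeue path is assumed to hit the BUG_ON, i.e. when the
 * request is already waiting for retry.
 */
static bool model_complete_and_retry(struct model_rq *rq)
{
	assert(rq != NULL);
	if (rq->on_requeue_list)
		return false;		/* second completion of the same request */
	rq->state = RQ_IDLE;		/* requeue resets the request state */
	rq->on_requeue_list = true;
	return true;
}

/*
 * Error recovery completes the request first, then the delayed timeout
 * completes it again. Returns true if the second completion was
 * detected as a double completion.
 */
static bool model_race(void)
{
	struct model_rq rq = { .state = RQ_IN_FLIGHT, .on_requeue_list = false };
	bool first = model_complete_and_retry(&rq);	/* error recovery */
	bool second = model_complete_and_retry(&rq);	/* delayed timeout */

	return first && !second;
}
```

In this model the first completion succeeds and the second one is
rejected. Syncing the io queues in teardown is meant to make sure the
timeout handling has finished before error recovery cancels the
requests, so the second completion never happens.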


