[PATCH 0/3] avoid race between time out and tear down
Chao Leng
lengchao at huawei.com
Wed Oct 21 21:58:25 EDT 2020
On 2020/10/21 16:53, Sagi Grimberg wrote:
>
>>>>>> Avoid race between time out and tear down for rdma and tcp.
>>>>>
>>>>> This patchset overall looks good, but we still need the patch that
>>>>> avoids double completion:
>>>>>
>>>>> --
>>>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>>>> index 629b025685d1..46428ff0b0fc 100644
>>>>> --- a/drivers/nvme/host/tcp.c
>>>>> +++ b/drivers/nvme/host/tcp.c
>>>>> @@ -2175,7 +2175,7 @@ static void nvme_tcp_complete_timed_out(struct request *rq)
>>>>> /* fence other contexts that may complete the command */
>>>>> mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock);
>>>>> nvme_tcp_stop_queue(ctrl, nvme_tcp_queue_id(req->queue));
>>>>> - if (!blk_mq_request_completed(rq)) {
>>>>> + if (blk_mq_request_started(rq) && !blk_mq_request_completed(rq)) {
>>>> Yes, this patch is needed, and likewise for nvme_cancel_request.
>>>> This will fix the race with asynchronous completion.
>>>>
>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>> index e85f6304efd7..1e838d952096 100644
>>>> --- a/drivers/nvme/host/core.c
>>>> +++ b/drivers/nvme/host/core.c
>>>> @@ -338,7 +338,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved)
>>>> "Cancelling I/O %d", req->tag);
>>>>
>>>> /* don't abort one completed request */
>>>> - if (blk_mq_request_completed(req))
>>>> + if (blk_mq_request_completed(req) || !blk_mq_request_started(req))
>>>> return true;
>>>>
>>>> nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD;
>>>
>>> This one is unneeded because blk_mq_tagset_busy_iter checks that the
>>> request has started...
>> Yes, it is already checked.
>
> Can you add the attached patches and resend?
>
> You can also add my:
> Reviewed-by: Sagi Grimberg <sagi at grimberg.me>
OK, I will do it later.
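
For anyone following the thread, here is a minimal userspace sketch of the two checks discussed above. It is not kernel code: every model_* name is hypothetical, and the three states are only an assumed simplification of the blk-mq request states. It shows the timeout path completing a request only when it has started and has not already completed, and an iterator that, like blk_mq_tagset_busy_iter as described above, skips requests that never started, so the cancel callback needs no extra started check.

```c
#include <stdbool.h>

/* Assumed, simplified request states; not the kernel definitions. */
enum model_rq_state { MODEL_RQ_IDLE, MODEL_RQ_IN_FLIGHT, MODEL_RQ_COMPLETE };

struct model_request {
	enum model_rq_state state;
	int completions;	/* number of times the request was completed */
};

/* Hypothetical stand-in for blk_mq_request_started(). */
static bool model_request_started(const struct model_request *rq)
{
	return rq->state != MODEL_RQ_IDLE;
}

/* Hypothetical stand-in for blk_mq_request_completed(). */
static bool model_request_completed(const struct model_request *rq)
{
	return rq->state == MODEL_RQ_COMPLETE;
}

static void model_complete(struct model_request *rq)
{
	rq->state = MODEL_RQ_COMPLETE;
	rq->completions++;
}

/*
 * Timeout path with the guard from the tcp.c hunk: complete only a
 * request that has started and has not been completed elsewhere.
 * Returns true if this call completed the request.
 */
static bool model_complete_timed_out(struct model_request *rq)
{
	if (model_request_started(rq) && !model_request_completed(rq)) {
		model_complete(rq);
		return true;
	}
	return false;
}

/* Cancel callback: only the completed check, as in nvme_cancel_request. */
static void model_cancel(struct model_request *rq)
{
	if (model_request_completed(rq))
		return;
	model_complete(rq);
}

/*
 * Iterator that skips requests that never started, so the callback
 * above never sees an idle request. Returns the number visited.
 */
static int model_busy_iter(struct model_request *rqs, int n,
			   void (*fn)(struct model_request *))
{
	int visited = 0;

	for (int i = 0; i < n; i++) {
		if (!model_request_started(&rqs[i]))
			continue;
		fn(&rqs[i]);
		visited++;
	}
	return visited;
}
```

In this model, calling model_complete_timed_out() on an idle request does nothing, and calling it twice on an in-flight request completes it only once. The sketch is single-threaded, so it illustrates the state checks only, not the locking that the real teardown path relies on.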