[PATCH 0/3] avoid race between time out and tear down

Wed Oct 21 00:59:01 EDT 2020

>>> Avoid race between time out and tear down for rdma and tcp.
>>
>> This patchset overall looks good, but we still need the patch that
>> avoids double completion:
>>
>> -- 
>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>> index 629b025685d1..46428ff0b0fc 100644
>> --- a/drivers/nvme/host/tcp.c
>> +++ b/drivers/nvme/host/tcp.c
>> @@ -2175,7 +2175,7 @@ static void nvme_tcp_complete_timed_out(struct request *rq)
>>          /* fence other contexts that may complete the command */
>>          mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock);
>>          nvme_tcp_stop_queue(ctrl, nvme_tcp_queue_id(req->queue));
>> -       if (!blk_mq_request_completed(rq)) {
>> +       if (blk_mq_request_started(rq) && !blk_mq_request_completed(rq)) {
> Yes, this patch is needed, and the same applies to nvme_cancel_request.
> This will fix the race with asynchronous completion.
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index e85f6304efd7..1e838d952096 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -338,7 +338,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved)
>                                  "Cancelling I/O %d", req->tag);
> 
>          /* don't abort one completed request */
> -       if (blk_mq_request_completed(req))
> +       if (blk_mq_request_completed(req) || !blk_mq_request_started(req))
>                  return true;
> 
>          nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD;

This one is unneeded because blk_mq_tagset_busy_iter() already checks that
the request has started before invoking the callback...
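
To illustrate the difference between the two call sites, here is a minimal
userspace sketch, not kernel code. Every toy_* name and the RQ_* states are
hypothetical stand-ins for the blk-mq helpers discussed above. It assumes the
iterator skips requests that have not started, so the cancel callback only
needs the "completed" check, while the timeout path is called directly and
has to check "started" itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the block layer's request states. */
enum rq_state { RQ_IDLE, RQ_IN_FLIGHT, RQ_COMPLETE };

struct toy_request {
	enum rq_state state;
	int completions;	/* how many times this request was completed */
};

/* A request counts as started once it has left the idle state. */
static bool toy_request_started(const struct toy_request *rq)
{
	return rq->state != RQ_IDLE;
}

static bool toy_request_completed(const struct toy_request *rq)
{
	return rq->state == RQ_COMPLETE;
}

/*
 * Timeout path: it is not reached through the iterator, so it checks
 * both conditions before completing the request.
 */
static bool toy_complete_timed_out(struct toy_request *rq)
{
	if (toy_request_started(rq) && !toy_request_completed(rq)) {
		rq->state = RQ_COMPLETE;
		rq->completions++;
		return true;
	}
	return false;
}

typedef bool (*toy_iter_fn)(struct toy_request *rq, void *data);

/*
 * Iterator: skips requests that have not started, so callbacks never
 * see an idle request (the assumption this sketch is built on).
 */
static void toy_tagset_busy_iter(struct toy_request *rqs, size_t n,
				 toy_iter_fn fn, void *data)
{
	for (size_t i = 0; i < n; i++) {
		if (!toy_request_started(&rqs[i]))
			continue;
		if (!fn(&rqs[i], data))
			break;
	}
}

/* Cancel callback: the "started" check is already done by the iterator. */
static bool toy_cancel_request(struct toy_request *rq, void *data)
{
	int *cancelled = data;

	/* don't abort an already completed request */
	if (toy_request_completed(rq))
		return true;

	rq->state = RQ_COMPLETE;
	rq->completions++;
	(*cancelled)++;
	return true;
}
```

Running toy_tagset_busy_iter() over one idle, one in-flight and one completed
request cancels only the in-flight one, and calling toy_complete_timed_out()
on an idle request leaves it untouched.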