[PATCH v3 3/5] nvme-fabrics: avoid double request completion for nvmf_fail_nonready_command

Thu Jan 21 20:50:16 EST 2021


On 2021/1/21 17:27, Hannes Reinecke wrote:
> On 1/21/21 10:00 AM, Christoph Hellwig wrote:
>> On Thu, Jan 21, 2021 at 09:58:37AM +0100, Hannes Reinecke wrote:
>>> On 1/21/21 8:03 AM, Chao Leng wrote:
>>>> When reconnect, the request may be completed with NVME_SC_HOST_PATH_ERROR
>>>> in nvmf_fail_nonready_command. The state of request will be changed to
>>>> MQ_RQ_IN_FLIGHT before call nvme_complete_rq. If free the request
>>>> asynchronously such as in nvme_submit_user_cmd, in extreme scenario
>>>> the request may be completed again in tear down process.
>>>> nvmf_fail_nonready_command do not need calling blk_mq_start_request
>>>> before complete the request. nvmf_fail_nonready_command should set
>>>> the state of request to MQ_RQ_COMPLETE before complete the request.
>>>>
>>>
>>> So what you are saying is that there is a race condition between
>>> blk_mq_start_request()
>>> and
>>> nvme_complete_request()
>>
>> Between those to a teardown that cancels all requests can come in.
>>
> Doesn't nvme_complete_request() insulate against a double completion?
nvme_complete_request can not insulate against double completion.
Setting the state of request to MQ_RQ_COMPLETE avoid double completion.
tear down(nvme_cancel_request) check the state of the request, if the
state is MQ_RQ_COMPLETE, it will skip completion.
> I seem to remember we've gone through great lengths ensuring that.
> 
> And if this is just about setting the correct error code on completion I'd really prefer to stick with the current code. Moving that into a helper is fine, but I'd rather not introduce our own code modifying request state.
> 
> If there really is a race condition this feels like a more generic problem; calling blk_mq_start_request() followed by blk_mq_end_request() is a quite common pattern, and from my impression the recommended way.
> So if there is an issue it would need to be addressed for all drivers, not just some nvme-specific way.
Currently, it is not safe for nvme. The probability is very low.
I am not sure whether similar occurs in other scenarios.
> Plus I'd like to have Jens' opinion here.
> 
> Cheers,
> 
> Hannes