[PATCH 1/3] nvme-tcp: Do not terminate commands when in RESETTING

Hannes Reinecke hare at suse.de
Tue Sep 12 05:06:36 PDT 2023


On 9/12/23 13:54, Sagi Grimberg wrote:
> 
>> When the error recovery started it will terminate all commands
>> anyway, so we should just return BLK_EH_RESET_TIMER for a
>> command timeout.
> 
> Does this fix a real issue? the timeout handler and the error
> recovery should be mutually exclusive.
> 
They are not. When several timeouts are triggering at the same time,
only the first will start error recovery; the remaining ones will
still hit the timeout function until cancel_tagset() is called.

>> Signed-off-by: Hannes Reinecke <hare at suse.de>
>> ---
>>   drivers/nvme/host/tcp.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>> index 5b332d9f87fc..1faef1cf5c94 100644
>> --- a/drivers/nvme/host/tcp.c
>> +++ b/drivers/nvme/host/tcp.c
>> @@ -2274,6 +2274,14 @@ static enum blk_eh_timer_return 
>> nvme_tcp_timeout(struct request *rq)
>>           nvme_tcp_queue_id(req->queue), nvme_cid(rq), pdu->hdr.type,
>>           opc, nvme_opcode_str(qid, opc, fctype));
>> +    /*
>> +     * If the error recovery is started all commands will be
>> +     * aborted anyway, and nothing is to be done here.
>> +     */
>> +    if (ctrl->state == NVME_CTRL_RESETTING &&
>> +        work_pending(&to_tcp_ctrl(ctrl)->err_work))
>> +        return BLK_EH_RESET_TIMER;
> 
> What does the work_pending() check give you here?
> 
'RESETTING' is also entered for firmware download, so we need to check
for work_pending() to find out if the error recovery engaged.

Cheers,

Hannes




More information about the Linux-nvme mailing list