[PATCH 1/3] nvme-tcp: Do not terminate commands when in RESETTING
Hannes Reinecke
hare at suse.de
Tue Sep 12 05:06:36 PDT 2023
On 9/12/23 13:54, Sagi Grimberg wrote:
>
>> When the error recovery started it will terminate all commands
>> anyway, so we should just return BLK_EH_RESET_TIMER for a
>> command timeout.
>
> Does this fix a real issue? the timeout handler and the error
> recovery should be mutually exclusive.
>
They are not. When several timeouts are triggering at the same time,
only the first will start error recovery; the remaining ones will
still hit the timeout function until cancel_tagset() is called.
>> Signed-off-by: Hannes Reinecke <hare at suse.de>
>> ---
>> drivers/nvme/host/tcp.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>> index 5b332d9f87fc..1faef1cf5c94 100644
>> --- a/drivers/nvme/host/tcp.c
>> +++ b/drivers/nvme/host/tcp.c
>> @@ -2274,6 +2274,14 @@ static enum blk_eh_timer_return
>> nvme_tcp_timeout(struct request *rq)
>> nvme_tcp_queue_id(req->queue), nvme_cid(rq), pdu->hdr.type,
>> opc, nvme_opcode_str(qid, opc, fctype));
>> + /*
>> + * If the error recovery is started all commands will be
>> + * aborted anyway, and nothing is to be done here.
>> + */
>> + if (ctrl->state == NVME_CTRL_RESETTING &&
>> + work_pending(&to_tcp_ctrl(ctrl)->err_work))
>> + return BLK_EH_RESET_TIMER;
>
> What does the work_pending() check give you here?
>
'RESETTING' is also entered for firmware download, so we need to check
for work_pending() to find out if the error recovery engaged.
Cheers,
Hannes
More information about the Linux-nvme
mailing list