[PATCHv2] nvme-tcp: Do not terminate i/o commands during RESETTING
Max Gurtovoy
mgurtovoy at nvidia.com
Sat Jan 13 18:53:29 PST 2024
Hi Hannes,
On 12/01/2024 12:09, hare at kernel.org wrote:
> From: Hannes Reinecke <hare at suse.de>
>
> Terminating commands from the timeout handler might lead
> to a data corruption as the timeout might trigger before
> KATO expired.
Can you please explain the data corruption and how this patch is fixing it ?
> When several commands have been sent in a batch and
> the command timeouts trigger just after the keep-alive
> command has been sent then the first command will trigger
> the error recovery. But all other commands will timeout
> directly afterwards and will hit the timeout handler
> before the err_work workqueue handler has started.
> This results in these commands being aborted and
> immediately retried without waiting for KATO.
> So return BLK_EH_RESET_TIMER for I/O commands when
> the controller is in 'RESETTING' or 'DELETING'
> state to ensure that the commands will
> be retried only after the KATO interval.
I'm not sure I understand how does KATO and reconnect_delay are related ?
>
> Signed-off-by: Hannes Reinecke <hare at suse.de>
> ---
> drivers/nvme/host/tcp.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 08805f027810..9dcb2d3b123c 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2431,17 +2431,27 @@ static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq)
> struct nvme_tcp_cmd_pdu *pdu = nvme_tcp_req_cmd_pdu(req);
> u8 opc = pdu->cmd.common.opcode, fctype = pdu->cmd.fabrics.fctype;
> int qid = nvme_tcp_queue_id(req->queue);
> + enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
>
> dev_warn(ctrl->device,
> - "queue %d: timeout cid %#x type %d opcode %#x (%s)\n",
> + "queue %d: timeout cid %#x type %d opcode %#x (%s) state %d\n",
> nvme_tcp_queue_id(req->queue), nvme_cid(rq), pdu->hdr.type,
> - opc, nvme_opcode_str(qid, opc, fctype));
> + opc, nvme_opcode_str(qid, opc, fctype), state);
> +
> + /*
> + * If the controller is in state RESETTING or DELETING all
> + * inflight commands will be terminated soon which in turn
> + * may failover to a different path.
> + */
> + if ((state == NVME_CTRL_RESETTING ||
> + state == NVME_CTRL_DELETING) && qid > 0)
> + return BLK_EH_RESET_TIMER;
>
> - if (nvme_ctrl_state(ctrl) != NVME_CTRL_LIVE) {
> + if (state != NVME_CTRL_LIVE) {
> /*
> - * If we are resetting, connecting or deleting we should
> - * complete immediately because we may block controller
> - * teardown or setup sequence
> + * If the controller is not live we should complete
> + * immediately because we may block controller teardown
> + * or setup sequence
> * - ctrl disable/shutdown fabrics requests
> * - connect requests
> * - initialization admin requests
More information about the Linux-nvme
mailing list