[PATCH 5.15.y] nvme-tcp: fix potential unbalanced freeze & unfreeze

Sagi Grimberg sagi at grimberg.me
Sun Aug 13 07:37:07 PDT 2023


Sorry Greg...

Please disregard these patches...
I thought that if they applied to 6.1.y they would apply to the older stable kernels too...

I'll just send a fresh series for each stable kernel.

On 8/13/23 17:31, Sagi Grimberg wrote:
> From: Ming Lei <ming.lei at redhat.com>
> 
> Move start_freeze into nvme_tcp_configure_io_queues(); there are
> at least two benefits:
> 
> 1) fix unbalanced freeze and unfreeze, since re-connection work may
> fail or be broken by removal
> 
> 2) IO during error recovery can fail fast because nvme fabrics
> unquiesces queues after teardown.
> 
> One side effect is that a !mpath request may time out during connecting
> because of the queue topology change, but that does not look like a big deal:
> 
> 1) the same problem exists with the current code base
> 
> 2) compared with !mpath, the mpath use case is dominant
> 
> Fixes: 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic")
> Cc: stable at vger.kernel.org
> Signed-off-by: Ming Lei <ming.lei at redhat.com>
> Tested-by: Yi Zhang <yi.zhang at redhat.com>
> Reviewed-by: Sagi Grimberg <sagi at grimberg.me>
> Signed-off-by: Keith Busch <kbusch at kernel.org>
> ---
>   drivers/nvme/host/tcp.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 1dc7c733c7e3..8d67cdd844f5 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1884,6 +1884,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   		goto out_cleanup_connect_q;
>   
>   	if (!new) {
> +		nvme_start_freeze(ctrl);
>   		nvme_start_queues(ctrl);
>   		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
>   			/*
> @@ -1892,6 +1893,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   			 * to be safe.
>   			 */
>   			ret = -ENODEV;
> +			nvme_unfreeze(ctrl);
>   			goto out_wait_freeze_timed_out;
>   		}
>   		blk_mq_update_nr_hw_queues(ctrl->tagset,
> @@ -1996,7 +1998,6 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
>   	if (ctrl->queue_count <= 1)
>   		return;
>   	nvme_stop_admin_queue(ctrl);
> -	nvme_start_freeze(ctrl);
>   	nvme_stop_queues(ctrl);
>   	nvme_sync_io_queues(ctrl);
>   	nvme_tcp_stop_io_queues(ctrl);
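
For anyone following the reasoning in the commit message, here is a minimal
user-space sketch of why the placement of start_freeze matters. It is not
kernel code: struct model_ctrl, the model_* helpers and the two
*_flow_attempt() functions are hypothetical names made up for illustration,
and the freeze state is reduced to a plain counter. The error codes for a
failed connect are likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a controller: only the freeze depth is tracked. */
struct model_ctrl {
	int freeze_depth;	/* start_freeze calls not yet matched by unfreeze */
};

void model_start_freeze(struct model_ctrl *c)
{
	c->freeze_depth++;
}

void model_unfreeze(struct model_ctrl *c)
{
	assert(c->freeze_depth > 0);	/* unfreeze without freeze is a bug */
	c->freeze_depth--;
}

/*
 * Flow before the patch: teardown starts the freeze, and only a reconnect
 * that gets all the way through configure_io_queues unfreezes again.
 */
int old_flow_attempt(struct model_ctrl *c, bool connect_ok,
		     bool freeze_wait_ok)
{
	model_start_freeze(c);		/* done in teardown_io_queues */

	if (!connect_ok)
		return -EIO;		/* reconnect failed: freeze is leaked */
	if (!freeze_wait_ok)
		return -ENODEV;		/* timed out: freeze is leaked */

	model_unfreeze(c);
	return 0;
}

/*
 * Flow after the patch: freeze and unfreeze both live in
 * configure_io_queues, so every exit path is balanced.
 */
int new_flow_attempt(struct model_ctrl *c, bool connect_ok,
		     bool freeze_wait_ok)
{
	/* teardown no longer touches the freeze depth */
	if (!connect_ok)
		return -EIO;		/* nothing to undo */

	model_start_freeze(c);
	if (!freeze_wait_ok) {
		model_unfreeze(c);	/* the unfreeze added on the timeout path */
		return -ENODEV;
	}

	model_unfreeze(c);
	return 0;
}
```

With a zero-initialized struct model_ctrl, each old_flow_attempt() whose
connect step fails leaves freeze_depth one higher than before, while
new_flow_attempt() returns with freeze_depth unchanged on every path.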


