[PATCH v2 5/5] nvme-fc: Freeze queues before destroying them

James Smart jsmart2021 at gmail.com
Fri Jul 9 09:14:07 PDT 2021


On 7/8/2021 2:27 AM, Daniel Wagner wrote:
> nvme_wait_freeze_timeout() in nvme_fc_recreate_io_queues() needs to be
> paired with a nvme_start_freeze(). Without freezing first we will always
> timeout in nvme_wait_freeze_timeout().
> 
> Note there is a similar fix for RDMA 9f98772ba307 ("nvme-rdma: fix
> controller reset hang during traffic") which happens to follow the PCI
> strategy how to handle resetting the queues.
> 
> Signed-off-by: Daniel Wagner <dwagner at suse.de>
> ---
>   drivers/nvme/host/fc.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 8e1fc3796735..a38b01485939 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3249,6 +3249,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
>   		nvme_fc_xmt_ls_rsp(disls);
>   
>   	if (ctrl->ctrl.tagset) {
> +		nvme_start_freeze(&ctrl->ctrl);
>   		nvme_fc_delete_hw_io_queues(ctrl);
>   		nvme_fc_free_io_queues(ctrl);
>   	}
> 

Thanks for the note. That definitely helped me follow what is being 
attempted. I also agree with Hannes that the comment from the rdma patch 
should be present here as well, to help understand what's going on.

Looking at the patch - this is not done in the same place or manner as 
rdma. There, freezing and stopping the queues happens prior to 
cancelling, and that doesn't correspond to where this was added (this is 
after all cancellations). We also seem to be missing a 
nvme_sync_io_queues() call in the sequence. So I believe there's more 
work to be done on this patch. I'll see what I can do.
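
To make the ordering concern concrete, here is a rough userspace sketch 
of the sequence described above: freeze and stop first, then sync, then 
cancel, and only then tear the queues down. This is not kernel code and 
not a proposed patch. Every sketch_* function is a hypothetical stub 
that only records its name, and the exact position of the sync step 
relative to the others is an assumption for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_STEPS 8

/* Recorded call order, so the sequence can be inspected afterwards. */
const char *steps[MAX_STEPS];
int step_count;

static void record(const char *name)
{
	assert(step_count < MAX_STEPS);
	steps[step_count++] = name;
}

/*
 * Hypothetical stand-ins for the transport steps. None of these touch
 * real block-layer state; they only log that they were called.
 */
static void sketch_start_freeze(void)        { record("start_freeze"); }
static void sketch_stop_queues(void)         { record("stop_queues"); }
static void sketch_sync_io_queues(void)      { record("sync_io_queues"); }
static void sketch_cancel_requests(void)     { record("cancel_requests"); }
static void sketch_delete_hw_io_queues(void) { record("delete_hw_io_queues"); }
static void sketch_free_io_queues(void)      { record("free_io_queues"); }

/*
 * Assumed ordering: freezing and stopping come before cancellation,
 * and queue destruction comes last. The placement of the sync step
 * is an assumption, not something confirmed for nvme-fc.
 */
void sketch_teardown_io_queues(void)
{
	step_count = 0;
	sketch_start_freeze();
	sketch_stop_queues();
	sketch_sync_io_queues();
	sketch_cancel_requests();
	sketch_delete_hw_io_queues();
	sketch_free_io_queues();
}

/* Print the recorded order, one step per line. */
void sketch_print_steps(void)
{
	for (int i = 0; i < step_count; i++)
		printf("%d: %s\n", i + 1, steps[i]);
}
```

Calling sketch_teardown_io_queues() followed by sketch_print_steps() 
lists the steps in the order they ran. The patch as posted would 
instead put the freeze after the cancel step, right before the two 
teardown calls.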

We really need to see about a common layer for transports. So much of 
what we do is similar. We were ok at the start, but we've drifted apart 
over time and the requirements of the core layer aren't propagating to 
all transports.

-- james


