[PATCH v2] nvme: rdma/tcp: call nvme_mpath_stop() from reconnect workqueue

Sagi Grimberg sagi at grimberg.me
Sat Apr 24 01:21:03 BST 2021


> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index be905d4fdb47..fc07a7b0dc1d 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1202,6 +1202,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>   		return;
>   	}
>   
> +	nvme_mpath_stop(&ctrl->ctrl);
>   	nvme_rdma_reconnect_or_remove(ctrl);

It's pretty annoying to have to needlessly wait for the ana log page
request to time out... But this is also needed because we init ana_lock
in nvme_mpath_init while it can potentially be taken in ana_work, which
is a recipe for bad things...

So...

Reviewed-by: Sagi Grimberg <sagi at grimberg.me>

And this needs to go to stable. I don't see a definitive offending
commit here, so maybe just CC stable and make the title more
descriptive, saying that it fixes a possible list-poison?

Something like:
nvme: rdma/tcp: fix a possible list corruption when reconnect work is 
racing with anatt timer
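
To illustrate the ordering argument about ana_lock being re-initialized
while ana_work may still take it, here is a minimal userspace sketch,
with a pthread standing in for the kernel work item. All names here
(fake_ctrl, mpath_init, mpath_stop, reconnect) are hypothetical
stand-ins and not the kernel code; the only point is that the worker is
stopped and waited for before the lock is set up again.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the controller state, not struct nvme_ctrl. */
struct fake_ctrl {
	pthread_mutex_t ana_lock;   /* re-initialized on every (re)connect */
	pthread_t ana_worker;       /* plays the role of ana_work */
	atomic_bool stop;
	int worker_running;
	int updates;
};

/* Worker: repeatedly takes ana_lock until asked to stop. */
static void *ana_work(void *arg)
{
	struct fake_ctrl *c = arg;

	while (!atomic_load(&c->stop)) {
		pthread_mutex_lock(&c->ana_lock);
		c->updates++;
		pthread_mutex_unlock(&c->ana_lock);
	}
	return NULL;
}

/* Assumed analogue of the init path: sets up the lock, starts the worker. */
static void mpath_init(struct fake_ctrl *c)
{
	pthread_mutex_init(&c->ana_lock, NULL);
	atomic_store(&c->stop, false);
	c->updates = 0;
	c->worker_running =
		(pthread_create(&c->ana_worker, NULL, ana_work, c) == 0);
}

/* Assumed analogue of the stop path: wait until the worker is gone. */
static void mpath_stop(struct fake_ctrl *c)
{
	if (!c->worker_running)
		return;
	atomic_store(&c->stop, true);
	pthread_join(c->ana_worker, NULL);
	c->worker_running = 0;
}

/*
 * Reconnect: stop first, so nothing can hold ana_lock while it is
 * destroyed and initialized again. Returns 1 if the worker restarted.
 */
static int reconnect(struct fake_ctrl *c)
{
	mpath_stop(c);
	pthread_mutex_destroy(&c->ana_lock);
	mpath_init(c);
	return c->worker_running;
}
```

Dropping the mpath_stop() call from reconnect() in this sketch would let
the lock be destroyed and re-initialized while the worker may be inside
it, which is undefined behaviour for a pthread mutex and is the
userspace analogue of the problem described above.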



