[PATCH v0 5/6] nvme-fc: redesign locking and refcounting

Hannes Reinecke hare at suse.de
Fri Feb 16 03:09:20 PST 2024


On 2/16/24 09:45, Daniel Wagner wrote:
> The life time of the controller is managed by the upper layers.
> 
> Thus just ref counting the controller when creating it and giving the
> ref back on the cleanup path. This is how the other transport are
> managed as well. Until now, the ref count has been taken per LS request
> which is not really necessary as the core guarantees that there is no in
> flight request when shuting down (if we use the nvme APIs are used
> correctly).
> 
> In fact we don't really need the ref count for nvme_fc_ctrl at this
> point. Though, the FC transport is offloading the connect attempt to a
> workqueue and in the next patch we introduce a sync option for which the
> ref counter is necessary. So let's keep it around.
> 
> Also take a ref for lport and rport when creating the controller and
> give it back when we destroy the controller. This means these refs are
> tied to the life time of the controller and not the other way around.
> 
> We have also to reorder the cleanup code in nvme_fc_delete_ctrl and
> nvme_fc_free_ctrl so that we do not expose resources too long and run
> into use after free situations which are currently possible.
> 
> Signed-off-by: Daniel Wagner <dwagner at suse.de>
> ---
>   drivers/nvme/host/fc.c | 136 +++++++++++++----------------------------
>   1 file changed, 41 insertions(+), 95 deletions(-)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index ddbc5b21af5b..7f9edab57550 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -229,6 +229,9 @@ static struct device *fc_udev_device;
>   
>   static void nvme_fc_complete_rq(struct request *rq);
>   
> +static void nvme_fc_ctrl_put(struct nvme_fc_ctrl *);
> +static int nvme_fc_ctrl_get(struct nvme_fc_ctrl *);
> +
>   /* *********************** FC-NVME Port Management ************************ */
>   
>   static void __nvme_fc_delete_hw_queue(struct nvme_fc_ctrl *,
> @@ -800,7 +803,7 @@ nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctrl)
>   			dev_warn(ctrl->ctrl.device,
>   				"NVME-FC{%d}: Couldn't schedule reset.\n",
>   				ctrl->cnum);
> -			nvme_delete_ctrl(&ctrl->ctrl);
> +			nvme_fc_ctrl_put(ctrl);
>   		}
>   		break;
>   
> @@ -868,7 +871,7 @@ nvme_fc_unregister_remoteport(struct nvme_fc_remote_port *portptr)
>   			dev_warn(ctrl->ctrl.device,
>   				"NVME-FC{%d}: controller connectivity lost.\n",
>   				ctrl->cnum);
> -			nvme_delete_ctrl(&ctrl->ctrl);
> +			nvme_fc_ctrl_put(ctrl);
>   		} else
>   			nvme_fc_ctrl_connectivity_loss(ctrl);
>   	}
> @@ -1022,9 +1025,6 @@ fc_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>   
>   /* *********************** FC-NVME LS Handling **************************** */
>   
> -static void nvme_fc_ctrl_put(struct nvme_fc_ctrl *);
> -static int nvme_fc_ctrl_get(struct nvme_fc_ctrl *);
> -
>   static void nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg);
>   
>   static void
> @@ -1050,8 +1050,6 @@ __nvme_fc_finish_ls_req(struct nvmefc_ls_req_op *lsop)
>   	fc_dma_unmap_single(rport->dev, lsreq->rqstdma,
>   				  (lsreq->rqstlen + lsreq->rsplen),
>   				  DMA_BIDIRECTIONAL);
> -
> -	nvme_fc_rport_put(rport);
>   }
>   
Hmm. I'm a bit unsure about this; essentially you change the rport 
refcounting (and not just the controller refcounting).
And the problem here is that rport refcounting is actually tied to
the driver-internal rports, which have a different lifetime
(dev_loss_tmo and all that).

Would it be possible to break this in two, with one patch changing the 
controller/options refcounting and the other one changing the rport 
refcounting?

Cheers,

Hannes




More information about the Linux-nvme mailing list