[PATCH v2] nvme: rdma/tcp: call nvme_mpath_stop() from reconnect workqueue

Martin Wilck mwilck at suse.com
Tue Apr 27 10:04:13 BST 2021


On Fri, 2021-04-23 at 17:21 -0700, Sagi Grimberg wrote:
> 
> > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> > index be905d4fdb47..fc07a7b0dc1d 100644
> > --- a/drivers/nvme/host/rdma.c
> > +++ b/drivers/nvme/host/rdma.c
> > @@ -1202,6 +1202,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
> >                 return;
> >         }
> >   
> > +       nvme_mpath_stop(&ctrl->ctrl);
> >         nvme_rdma_reconnect_or_remove(ctrl);
> 
> It's pretty annoying to have to needlessly wait for the ana log page
> request to time out. But this is also needed because we init ana_lock
> in nvme_mpath_init while it can potentially be taken in ana_work, which
> is a recipe for bad things...

What if we move the ana-related fields into a separately kmalloc'd
structure, rather than embedding them in nvme_ctrl? That way we could
just detach the structure when the controller is re-initialized, without
needing to wait. When the timeouts eventually occur, the work function
would see that the structure is detached and just do nothing.
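
To make the idea concrete, here is a rough userspace sketch, not driver
code. Every name in it (ana_state, fake_ctrl, ctrl_reinit_ana, ana_work_fn)
is made up for illustration, and C11 atomics stand in for what would
presumably be a kref plus ana_lock in the kernel. It assumes that whoever
queues the work takes a reference on the state first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical: ANA state split out of the controller and refcounted. */
struct ana_state {
	atomic_int refs;       /* one ref for the controller, one per queued work */
	atomic_bool detached;  /* set once the controller has dropped this state */
	int updates;           /* counts ANA log results that were applied */
};

/* Hypothetical stand-in for struct nvme_ctrl. */
struct fake_ctrl {
	struct ana_state *ana;
};

static struct ana_state *ana_state_alloc(void)
{
	struct ana_state *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	atomic_init(&s->refs, 1);
	atomic_init(&s->detached, false);
	return s;
}

static struct ana_state *ana_state_get(struct ana_state *s)
{
	atomic_fetch_add(&s->refs, 1);
	return s;
}

/* Drop one reference; returns true if this call freed the state. */
static bool ana_state_put(struct ana_state *s)
{
	int prev = atomic_fetch_sub(&s->refs, 1);

	assert(prev > 0);
	if (prev != 1)
		return false;
	free(s);
	return true;
}

/*
 * Re-initialization: mark the old state detached and drop the controller's
 * reference without waiting for pending work, then install a fresh state.
 * Returns 0 on success, -1 if allocation fails.
 */
static int ctrl_reinit_ana(struct fake_ctrl *ctrl)
{
	struct ana_state *fresh = ana_state_alloc();
	struct ana_state *old = ctrl->ana;

	if (!fresh)
		return -1;
	if (old) {
		atomic_store(&old->detached, true);
		ana_state_put(old);
	}
	ctrl->ana = fresh;
	return 0;
}

/*
 * Work function body: runs when the log page request completes or times
 * out. It owns the reference taken at queue time, touches only its own
 * state (never the controller), and drops the reference when done.
 * Returns true if the result was applied.
 */
static bool ana_work_fn(struct ana_state *s)
{
	bool applied = false;

	if (!atomic_load(&s->detached)) {
		s->updates++;
		applied = true;
	}
	ana_state_put(s);
	return applied;
}
```

The point of the sketch is that a late work item only ever dereferences
the state it holds a reference on, so re-initializing the controller never
re-inits a lock that the work item might still take. The detached check
just avoids pointless work; the refcount is what keeps the memory valid.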

Regards
Martin




