[PATCH] nvme-core: warn and abort if shared ns with empty list

irvin cote irvincoteg at gmail.com
Tue Jun 13 12:07:35 PDT 2023


I have investigate the question a bit more and I have come to make
these three observations:

In case of a hot unplug :

It looks like in the case of a hot unplug the ns heads that have an
empty list are removed from the subsystem's list of nsheads regardless
of the head's refcount; as per nvme_ns_remove. Therefore after a
hot-unplug then replug we should not see stale ns heads from before.
Even if we did see stale ns heads from before the unplug then the new
head might not be added because of nvme_subsys_check_duplicate_ids
failing


Why I think we only have shared namespaces when reaching this location:

In nvme_find_ns_head, when we check for the emptiness of the head's
list (meaning we've asserted (h->ns_id == nsid &&
nvme_is_unique_nsid(ctrl, h)) we should only have shared namespaces
because otherwise we would have two private namespace with the same
nsid while asserting nvme_is_unique_nsid.


The refcount problem:

If the list is empty and the refcount is not 0 we will take a
reference on the head that we will never put back.

Hence I think we should either erase the check for the head's list
emptiness if the condition never occurs or if it does, warn and exit.

On Wed, 7 Jun 2023 at 11:40, irvin cote <irvincoteg at gmail.com> wrote:
>
> But in the case you are describing, wouldn't it fail at
> nvme_subsys_check_duplicate_ids?
>
> On Tue, 6 Jun 2023 at 19:55, Keith Busch <kbusch at kernel.org> wrote:
> >
> > On Tue, Jun 06, 2023 at 10:21:34AM +0200, Irvin Cote wrote:
> > > In nvme_find_ns_head if we find a shared ns but it has an empty
> > > list or its refcount is 0 we gloss over it and go to the next.
> > > Ignoring when such a case happens means that when nvme_find_ns_head
> > > returns NULL, we could create a new nshead with the same nsid. Hence
> > > we could find ourselves with two shared nsheads having the same nsid.
> > > Instead warn the user through dev_err and exit.
> >
> > Let's say you're holding a reference on the old "head" and then hot
> > remove the drive. The kernel will tear down the device and its
> > namespace, but the subsystem and empty head will remain due to that
> > held reference.
> >
> > Now plug the drive back in. Today, the controller will attach to the
> > previous subsystem, and make a new "head" for its namespace (because the
> > old one is empty), and you can continue from there, though the new
> > handle names may look a bit odd.
> >
> > This change won't make the re-added namespace visible to the user, which
> > doesn't sound right.
> >
> > > Signed-off-by: Irvin Cote <irvincoteg at gmail.com>
> > > ---
> > >  drivers/nvme/host/core.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > > index c8a471482421..175f1516083f 100644
> > > --- a/drivers/nvme/host/core.c
> > > +++ b/drivers/nvme/host/core.c
> > > @@ -3341,6 +3341,11 @@ static struct nvme_ns_head *nvme_find_ns_head(struct nvme_ctrl *ctrl,
> > >                       continue;
> > >               if (!list_empty(&h->list) && nvme_tryget_ns_head(h))
> > >                       return h;
> > > +             else {
> > > +                     dev_err(ctrl->device,
> > > +                             "Found shared namespace with empty list or 0 refcount\n");
> >
> > I don't think we're guaranteed this path is using a "shared" namespace.
> >
> > > +                     return ERR_PTR(-ENOENT);
> > > +             }
> > >       }
> > >
> > >       return NULL;



More information about the Linux-nvme mailing list