[RFC PATCH] nvme: fix RCU hole that allowed for endless looping in multipath round robin

Chris Leech cleech at redhat.com
Tue Apr 5 10:25:33 PDT 2022


On Fri, Mar 25, 2022 at 5:42 AM Sagi Grimberg <sagi at grimberg.me> wrote:
>
> On 3/22/22 00:43, Chris Leech wrote:
> > Make nvme_ns_remove match the assumptions elsewhere.
> >
> > 1) !NVME_NS_READY needs to be srcu synchronized to make sure nothing is
> >     running in __nvme_find_path or nvme_round_robin_path that will
> >     re-assign this ns to current_path.
> >
> > 2) Any matching current_path entries need to be cleared before removing
> >     from the siblings list, to prevent calling nvme_round_robin_path with
> >     an "old" ns that's off list.
> >
> > 3) Finally the list_del_rcu can happen, and then synchronize again
> >     before releasing any reference counts.
> > ---
> >   drivers/nvme/host/core.c | 13 +++++++++----
> >   1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index fd4720d37cc0..20778dc9224c 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -3917,6 +3917,15 @@ static void nvme_ns_remove(struct nvme_ns *ns)
> >       set_capacity(ns->disk, 0);
> >       nvme_fault_inject_fini(&ns->fault_inject);
> >
> > +     /* ensure that !NVME_NS_READY is seen
> > +      * to prevent this ns going back in current_path
> > +      */
> > +     synchronize_srcu(&ns->head->srcu);
> > +
> > +     /* wait for concurrent submissions */
> > +     if (nvme_mpath_clear_current_path(ns))
> > +             synchronize_srcu(&ns->head->srcu);
>
> Can you describe how the unconditional srcu is protecting
> something that is not protected with the one under clearing
> the current_path?

My concern there was if nvme_mpath_clear_current_path was run while a
different CPU was in __nvme_find_path or nvme_round_robin_path, beyond
an nvme_path_is_disabled check but before rcu_assign_pointer, then the
ns being removed could be re-assigned to current_path.

Apologies for not noticing this question before now.

- Chris




More information about the Linux-nvme mailing list