[PATCH] nvme: fix (S)RCU protection of nvme_ns_head list (alternate)

Caleb Sander csander at purestorage.com
Thu Dec 1 13:17:52 PST 2022


On Wed, Nov 30, 2022 at 12:40 AM Sagi Grimberg <sagi at grimberg.me> wrote:
>
>
> >> I understand what you mean in general, but in this particular case
> >> I don't understand what is not working.
> >
> > How does this work?
>
> Particularly, the sleeping contexts are guaranteed not to dereference
> this NS after the two previous srcu synchronization steps so at this
> point, the only protection of nvme_ns_remove is against this
> non-sleepable traversal, which should be enough to protect with rcu.

Can you help me understand the safety here?
The namespace is dereferenced while traversing the siblings list, and that
traversal is protected by SRCU.
But nvme_ns_remove() only calls synchronize_rcu(), not synchronize_srcu(),
between removing the namespace from the siblings list and freeing it.
So it seems like there's a race here:
Thread A:                               Thread B:
nvme_ns_remove() executes
past the last synchronize_srcu()
                                        nvme_ns_head_submit_bio()
                                        calls srcu_read_lock(),
                                        starts traversing siblings list,
                                        holds pointer to ns
Removes ns from siblings list
Calls synchronize_rcu()
(does not block for the SRCU reader)
nvme_put_ns() reaches 0 references,
frees ns
                                        Dereferences ns to continue traversal
                                        => USE AFTER FREE
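
To make the interleaving above concrete, here is a minimal userspace sketch
of it. This is a toy model, not kernel code: struct grace_domain, struct
toy_ns, synchronize_would_wait() and replay_race() are hypothetical names
made up for illustration, a "grace period" is reduced to a reader count, and
"freeing" only sets a flag so the model itself never touches freed memory.
The case where the remover waits on the SRCU domain is included only for
contrast.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one RCU flavor: it only counts active readers. */
struct grace_domain {
	int active_readers;
};

/* Toy stand-in for a namespace on the siblings list. */
struct toy_ns {
	bool on_siblings_list;
	bool freed;
};

static void read_lock(struct grace_domain *d)
{
	d->active_readers++;
}

static void read_unlock(struct grace_domain *d)
{
	assert(d->active_readers > 0);
	d->active_readers--;
}

/*
 * A real synchronize_*() call blocks until pre-existing readers of its
 * own domain have finished. The model only reports whether it would
 * have to wait; readers of a different domain are invisible to it.
 */
static bool synchronize_would_wait(const struct grace_domain *d)
{
	return d->active_readers > 0;
}

/*
 * Replays the Thread A / Thread B interleaving step by step.
 * Returns true if Thread B ends up dereferencing a freed namespace.
 */
static bool replay_race(bool remover_waits_on_srcu)
{
	struct grace_domain rcu = { 0 }, srcu = { 0 };
	struct toy_ns ns = { .on_siblings_list = true, .freed = false };
	bool reader_done = false;
	bool use_after_free = false;

	/* Thread B: takes the SRCU read lock and holds a pointer to ns. */
	read_lock(&srcu);
	struct toy_ns *held = &ns;

	/* Thread A: removes ns from the siblings list. */
	ns.on_siblings_list = false;

	/* Thread A: waits for a grace period in the chosen domain. */
	if (synchronize_would_wait(remover_waits_on_srcu ? &srcu : &rcu)) {
		/* Thread A blocks, so Thread B finishes before the free. */
		use_after_free = held->freed;
		read_unlock(&srcu);
		reader_done = true;
	}

	/* Thread A: last reference dropped, ns is freed. */
	ns.freed = true;

	/* Thread B: continues the traversal if it has not finished yet. */
	if (!reader_done) {
		use_after_free = held->freed;
		read_unlock(&srcu);
	}
	return use_after_free;
}
```

In this model replay_race(false) reports the use after free, because the
RCU domain has no readers and the remover does not wait, while
replay_race(true) does not, because the remover sees the SRCU reader.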


