[PATCH] nvme: fix SRCU protection of nvme_ns_head list

Paul E. McKenney paulmck at kernel.org
Mon Nov 21 09:59:32 PST 2022


On Mon, Nov 21, 2022 at 09:48:31AM -0800, Caleb Sander wrote:
> On Sun, Nov 20, 2022 at 11:40 PM Christoph Hellwig <hch at lst.de> wrote:
> >
> > On Sun, Nov 20, 2022 at 01:24:51PM +0200, Sagi Grimberg wrote:
> > >>      sector_t capacity = get_capacity(head->disk);
> > >>      int node;
> > >> +    int srcu_idx;
> > >>   +  srcu_idx = srcu_read_lock(&head->srcu);
> > >>      list_for_each_entry_rcu(ns, &head->list, siblings) {
> > >>              if (capacity != get_capacity(ns->disk))
> > >>                      clear_bit(NVME_NS_READY, &ns->flags);
> > >>      }
> > >> +    srcu_read_unlock(&head->srcu, srcu_idx);
> > >
> > > I don't think you need srcu here, rcu_read_lock/unlock is sufficient.
> >
> > So the code obviously does not sleep.  But I wonder if anything speaks
> > against mixing SRCU and RCU protection for the same list?
> 
> As I understand, synchronize_rcu() only synchronizes with RCU critical sections,
> and likewise synchronize_srcu() only synchronizes with SRCU critical sections.
> Currently nvme_ns_head_submit_bio() is using srcu_read_lock()
> but nvme_ns_remove() is using synchronize_rcu(),
> so there is no synchronization at all.
> Since nvme_ns_head_submit_bio() sleeps during the critical section,
> we definitely need to keep using SRCU there,
> and therefore nvme_ns_remove() needs to use synchronize_srcu().
> I agree RCU would work in nvme_mpath_revalidate_paths(), but as Paul points out,
> nvme_ns_remove() would then need to synchronize with both SRCU and RCU.
> This seems like it would complicate nvme_ns_remove()
> and require waiting on readers of unrelated RCU-protected structures.
> Is there some cost to using SRCU over RCU that I am missing?

The cost is that srcu_read_lock() and srcu_read_unlock() both invoke
smp_mb().  This might or might not be a problem for you, depending on
how fast your fastpaths need to be.  Another cost is the need to pass
the return value of srcu_read_lock() to srcu_read_unlock(), which is
usually more of a software-engineering cost than it is a performance cost.

Of course, the easiest thing is to try it both ways and see if you can
measure the difference.  If you are unsure whether or not it is safe
to replace rcu_read_lock() with srcu_read_lock(), you can just add
smp_mb() after each rcu_read_lock() and before each rcu_read_unlock().
That won't be exactly, but it will get you quite close to the cost of
srcu_read_lock() and srcu_read_unlock().

							Thanx, Paul



More information about the Linux-nvme mailing list