[PATCH] nvme: fix SRCU protection of nvme_ns_head list

Paul E. McKenney paulmck at kernel.org
Mon Nov 21 16:25:37 PST 2022


On Mon, Nov 21, 2022 at 11:58:53AM -0800, Caleb Sander wrote:
> On Mon, Nov 21, 2022 at 9:59 AM Paul E. McKenney <paulmck at kernel.org> wrote:
> >
> > On Mon, Nov 21, 2022 at 09:48:31AM -0800, Caleb Sander wrote:
> > > On Sun, Nov 20, 2022 at 11:40 PM Christoph Hellwig <hch at lst.de> wrote:
> > > >
> > > > On Sun, Nov 20, 2022 at 01:24:51PM +0200, Sagi Grimberg wrote:
> > > > >>      sector_t capacity = get_capacity(head->disk);
> > > > >>      int node;
> > > > >> +    int srcu_idx;
> > > > >>
> > > > >> +    srcu_idx = srcu_read_lock(&head->srcu);
> > > > >>      list_for_each_entry_rcu(ns, &head->list, siblings) {
> > > > >>              if (capacity != get_capacity(ns->disk))
> > > > >>                      clear_bit(NVME_NS_READY, &ns->flags);
> > > > >>      }
> > > > >> +    srcu_read_unlock(&head->srcu, srcu_idx);
> > > > >
> > > > > I don't think you need srcu here, rcu_read_lock/unlock is sufficient.
> > > >
> > > > So the code obviously does not sleep.  But I wonder if anything speaks
> > > > against mixing SRCU and RCU protection for the same list?
> > >
> > > As I understand, synchronize_rcu() only synchronizes with RCU critical sections,
> > > and likewise synchronize_srcu() only synchronizes with SRCU critical sections.
> > > Currently nvme_ns_head_submit_bio() is using srcu_read_lock()
> > > but nvme_ns_remove() is using synchronize_rcu(),
> > > so there is no synchronization at all.
> > > Since nvme_ns_head_submit_bio() sleeps during the critical section,
> > > we definitely need to keep using SRCU there,
> > > and therefore nvme_ns_remove() needs to use synchronize_srcu().
> > > I agree RCU would work in nvme_mpath_revalidate_paths(), but as Paul points out,
> > > nvme_ns_remove() would then need to synchronize with both SRCU and RCU.
> > > This seems like it would complicate nvme_ns_remove()
> > > and require waiting on readers of unrelated RCU-protected structures.
> > > Is there some cost to using SRCU over RCU that I am missing?
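[Editorial note: for illustration, the flavor mismatch described above can be sketched as follows. This is simplified from the actual reader and updater paths in drivers/nvme/host; it is not the literal kernel code.]

```c
/* Reader, nvme_ns_head_submit_bio(): can sleep inside the
 * critical section, so it must use SRCU on head->srcu. */
srcu_idx = srcu_read_lock(&head->srcu);
ns = nvme_find_path(head);
/* ... may block while submitting I/O ... */
srcu_read_unlock(&head->srcu, srcu_idx);

/* Updater, nvme_ns_remove(): synchronize_rcu() waits only for
 * rcu_read_lock() readers, not for SRCU readers of head->srcu,
 * so it does not wait for the reader above at all. */
list_del_rcu(&ns->siblings);
synchronize_rcu();              /* wrong flavor for this list */
/* The fix is synchronize_srcu(&head->srcu) instead. */
```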
> >
> > The cost is that srcu_read_lock() and srcu_read_unlock() both invoke
> > smp_mb().  This might or might not be a problem for you, depending on
> > how fast your fastpaths need to be.  Another cost is the need to pass
> > the return value of srcu_read_lock() to srcu_read_unlock(), which is
> > usually more of a software-engineering cost than it is a performance cost.
> 
> Thanks, that's great to know! I think this is probably not performance-critical,
> since it's called only for changes in the connected NVMe namespaces.
> 
> > Of course, the easiest thing is to try it both ways and see if you can
> > measure the difference.  If you are unsure whether or not it is safe
> > to replace rcu_read_lock() with srcu_read_lock(), you can just add
> > smp_mb() after each rcu_read_lock() and before each rcu_read_unlock().
> > That won't be exact, but it will get you quite close to the cost of
> > srcu_read_lock() and srcu_read_unlock().
> 
> Regarding your suggestion to use synchronize_rcu_mult() to wait concurrently,
> I'm not sure it's possible to write an equivalent to call_my_srcu() here,
> since the srcu_struct is per-nvme_ns_head rather than global.
> I think we would probably need to synchronize with RCU and SRCU sequentially.

That might be a reason to use a global srcu_struct.  Though there might
be good reasons not to, for all I know.

							Thanx, Paul
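
[Editorial note: the global-srcu_struct variant Paul alludes to would follow the call_my_srcu() pattern from the kernel's RCU documentation. The names below (nvme_head_srcu, call_nvme_srcu) are hypothetical; this only works if the per-head srcu_struct were replaced by one global instance.]

```c
DEFINE_STATIC_SRCU(nvme_head_srcu);     /* hypothetical global */

/* Wrapper with call_rcu()'s signature, so it can be handed
 * to synchronize_rcu_mult(). */
static void call_nvme_srcu(struct rcu_head *head, rcu_callback_t func)
{
	call_srcu(&nvme_head_srcu, head, func);
}

/* In nvme_ns_remove(): wait for RCU and SRCU readers
 * concurrently rather than back to back. */
synchronize_rcu_mult(call_rcu, call_nvme_srcu);
```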



More information about the Linux-nvme mailing list