[PATCH] nvme: fix SRCU protection of nvme_ns_head list

Paul E. McKenney paulmck at kernel.org
Thu Dec 1 15:18:04 PST 2022


On Thu, Dec 01, 2022 at 01:27:16PM -0800, Caleb Sander wrote:
> On Tue, Nov 22, 2022 at 7:08 AM Sagi Grimberg <sagi at grimberg.me> wrote:
> > ...
> > The original patch report did not include any sequence that removes all
> > namespaces, and given that it came from RockyLinux 8.6 kernel, it is not
> > 6.1... Hence I think that we need to understand how a namespace removal
> > happened at the same time that the namespace is being scanned. Maybe
> > something else is broken.
> 
> After some more debugging, I found the main cause:
> ns->disk isn't set until after ns is linked into the ns_head list.
> The ns is linked in nvme_init_ns_head(),
> which nvme_alloc_ns() calls before creating the disk:
> static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid,
> struct nvme_ns_ids *ids)
> {
> // ...
> if (nvme_init_ns_head(ns, nsid, ids, id->nmic & NVME_NS_NMIC_SHARED))
> goto out_free_queue;
> 
> disk = alloc_disk_node(0, node);
> if (!disk)
> goto out_unlink_ns;
> // ...
> ns->disk = disk;
> // ...
> }
> 
> It looks like this has never been a problem upstream:
> 5f432cceb3e9 ("nvme: use blk_mq_alloc_disk") fixed the order before
> e7d65803e2bb ("nvme-multipath: revalidate paths during rescan") was committed.
> Strange that the Red Hat 8.6 kernel has pulled in only the later commit.

Sadly, not strange at all.  Keeping upstream sane is hard enough, and
also keeping N -stable releases sane is not any easier.  ;-)

> The list traversal here still needs to be protected by (S)RCU,
> so I would still push for this patch.
> Sagi had suggested using RCU here instead, but discussing with Chris,
> it sounds like we reached a consensus to consistently protect this list
> with SRCU. We can evaluate switching to RCU later if needed,
> but since this is not in the I/O path, I don't see a big performance concern.

For whatever it is worth, makes sense to me!

							Thanx, Paul



More information about the Linux-nvme mailing list