[PATCH 2/2] nvme: fix atomic write size validation
Yi Zhang
yi.zhang at redhat.com
Thu Jun 26 17:18:46 PDT 2025
On Fri, Jun 27, 2025 at 8:05 AM Uday Shankar <ushankar at purestorage.com> wrote:
>
> On Wed, Jun 25, 2025 at 08:39:56AM +0200, Christoph Hellwig wrote:
> > Don't mix the namespace and controller values, and validate the
> > per-controller limit when probing the controller. This avoid spurious
> > failures for controllers with namespaces that have different namespaces
>
> nit: having namespaces with different logical block sizes
>
> > with different logical block sizes, or report the per-namespace values
> > only for some namespaces.
> >
> > It also fixes a missing queue_limits_cancel_update in an error path by
> > removing that error path.
> >
> > Fixes: 8695f060a029 ("nvme: all namespaces in a subsystem must adhere to a common atomic write size")
> > Reported-by: Yi Zhang <yi.zhang at redhat.com>
>
> I couldn't find the report on linux-nvme; if it is public, can you
> include a "Closes:" link here?
Yes, it was reported here:
https://lore.kernel.org/linux-nvme/CAHj4cs93WutyoPLFMr0JidHhRAxHC1ZDcj-RvdnX=R7OaV5ejg@mail.gmail.com/
>
> > Signed-off-by: Christoph Hellwig <hch at lst.de>
> > Reviewed-by: Luis Chamberlain <mcgrof at kernel.org>
> > Reviewed-by: John Garry <john.g.garry at oracle.com>
> > Tested-by: Yi Zhang <yi.zhang at redhat.com>
>
> I also saw a problem in a system running on 6.16-rc3 with several NVMe
> subsystems, each containing one controller with AWUPF=0, each containing
> two namespaces with NSABP=0. One namespace has a 512-byte LBA size while
> the other has a 4096-byte LBA size, and some namespaces failed to add:
>
> # dmesg | grep Inconsistent
> [ 5.537494] nvme nvme4: nvme4n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
> [ 5.539038] nvme nvme6: nvme6n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=4096 bytes, Controller/Namespace=512 bytes
> [ 5.560079] nvme nvme7: nvme7n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=4096 bytes, Controller/Namespace=512 bytes
> [ 5.595093] nvme nvme3: nvme3n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=4096 bytes, Controller/Namespace=512 bytes
> [ 5.597627] nvme nvme8: nvme8n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
> [ 5.600007] nvme nvme0: nvme0n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
> [ 5.605748] nvme nvme5: nvme5n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
> [ 5.608961] nvme nvme11: nvme11n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=4096 bytes, Controller/Namespace=512 bytes
> [ 5.618011] nvme nvme12: nvme12n1: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=4096 bytes, Controller/Namespace=512 bytes
> [ 5.618251] nvme nvme10: nvme10n2: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=4096 bytes, Controller/Namespace=512 bytes
>
> Note that despite the messages saying otherwise, only those log lines
> containing "Subsystem=512 bytes, Controller/Namespace=4096 bytes"
> actually failed to add a namespace - the others had both namespaces
> added just fine. I guess it isn't deterministic as to which namespace
> was added first.
>
> Anyways, the problem was fixed by this patch set.
>
> Tested-by: Uday Shankar <ushankar at purestorage.com>
>
> > ---
> > drivers/nvme/host/core.c | 33 +++++++++++----------------------
> > drivers/nvme/host/nvme.h | 3 +--
> > 2 files changed, 12 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 520fb5f1e214..e533d791955d 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -2041,17 +2041,7 @@ static u32 nvme_configure_atomic_write(struct nvme_ns *ns,
> > * no clear language in the specification prohibiting different
> > * values for different controllers in the subsystem.
> > */
> > - atomic_bs = (1 + ns->ctrl->awupf) * bs;
> > - }
> > -
> > - if (!ns->ctrl->subsys->atomic_bs) {
> > - ns->ctrl->subsys->atomic_bs = atomic_bs;
> > - } else if (ns->ctrl->subsys->atomic_bs != atomic_bs) {
> > - dev_err_ratelimited(ns->ctrl->device,
> > - "%s: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=%d bytes, Controller/Namespace=%d bytes\n",
> > - ns->disk ? ns->disk->disk_name : "?",
> > - ns->ctrl->subsys->atomic_bs,
> > - atomic_bs);
> > + atomic_bs = (1 + ns->ctrl->subsys->awupf) * bs;
> > }
> >
> > lim->atomic_write_hw_max = atomic_bs;
> > @@ -2386,16 +2376,6 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
> > if (!nvme_update_disk_info(ns, id, &lim))
> > capacity = 0;
> >
> > - /*
> > - * Validate the max atomic write size fits within the subsystem's
> > - * atomic write capabilities.
> > - */
> > - if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) {
> > - blk_mq_unfreeze_queue(ns->disk->queue, memflags);
> > - ret = -ENXIO;
> > - goto out;
> > - }
> > -
> > nvme_config_discard(ns, &lim);
> > if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
> > ns->head->ids.csi == NVME_CSI_ZNS)
> > @@ -3219,6 +3199,7 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
> > memcpy(subsys->model, id->mn, sizeof(subsys->model));
> > subsys->vendor_id = le16_to_cpu(id->vid);
> > subsys->cmic = id->cmic;
> > + subsys->awupf = le16_to_cpu(id->awupf);
> >
> > /* Versions prior to 1.4 don't necessarily report a valid type */
> > if (id->cntrltype == NVME_CTRL_DISC ||
> > @@ -3556,6 +3537,15 @@ static int nvme_init_identify(struct nvme_ctrl *ctrl)
> > if (ret)
> > goto out_free;
> > }
> > +
> > + if (le16_to_cpu(id->awupf) != ctrl->subsys->awupf) {
> > + dev_err_ratelimited(ctrl->device,
> > + "inconsistent AWUPF, controller not added (%u/%u).\n",
> > + le16_to_cpu(id->awupf), ctrl->subsys->awupf);
> > + ret = -EINVAL;
> > + goto out_free;
> > + }
> > +
>
> Could you explain (and perhaps add a comment here) why all controllers
> in a subsystem must report the same awupf? Is it because namespaces may
> inherit awupf from the controller, and may be reachable through multiple
> controllers for multipath?
>
> > memcpy(ctrl->subsys->firmware_rev, id->fr,
> > sizeof(ctrl->subsys->firmware_rev));
> >
> > @@ -3651,7 +3641,6 @@ static int nvme_init_identify(struct nvme_ctrl *ctrl)
> > dev_pm_qos_expose_latency_tolerance(ctrl->device);
> > else if (!ctrl->apst_enabled && prev_apst_enabled)
> > dev_pm_qos_hide_latency_tolerance(ctrl->device);
> > - ctrl->awupf = le16_to_cpu(id->awupf);
> > out_free:
> > kfree(id);
> > return ret;
> > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> > index a468cdc5b5cb..7df2ea21851f 100644
> > --- a/drivers/nvme/host/nvme.h
> > +++ b/drivers/nvme/host/nvme.h
> > @@ -410,7 +410,6 @@ struct nvme_ctrl {
> >
> > enum nvme_ctrl_type cntrltype;
> > enum nvme_dctype dctype;
> > - u16 awupf; /* 0's based value. */
> > };
> >
> > static inline enum nvme_ctrl_state nvme_ctrl_state(struct nvme_ctrl *ctrl)
> > @@ -443,11 +442,11 @@ struct nvme_subsystem {
> > u8 cmic;
> > enum nvme_subsys_type subtype;
> > u16 vendor_id;
> > + u16 awupf; /* 0's based value. */
> > struct ida ns_ida;
> > #ifdef CONFIG_NVME_MULTIPATH
> > enum nvme_iopolicy iopolicy;
> > #endif
> > - u32 atomic_bs;
> > };
> >
> > /*
> > --
> > 2.47.2
> >
> >
>
--
Best Regards,
Yi Zhang
More information about the Linux-nvme
mailing list