[PATCH 3/5] nvme: add support for copy offload
Caleb Sander Mateos
csander at purestorage.com
Wed May 21 17:51:03 PDT 2025
On Wed, May 21, 2025 at 5:47 PM Caleb Sander Mateos
<csander at purestorage.com> wrote:
>
> On Wed, May 21, 2025 at 3:31 PM Keith Busch <kbusch at meta.com> wrote:
> >
> > From: Keith Busch <kbusch at kernel.org>
> >
> > Register the nvme namespace copy capablities with the request_queue
>
> nit: "capabilities"
>
> > limits and implement support for the REQ_OP_COPY operation.
> >
> > Signed-off-by: Keith Busch <kbusch at kernel.org>
> > ---
> > drivers/nvme/host/core.c | 61 ++++++++++++++++++++++++++++++++++++++++
> > include/linux/nvme.h | 42 ++++++++++++++++++++++++++-
> > 2 files changed, 102 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index f69a232a000ac..3134fe85b1abc 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -888,6 +888,52 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
> > return BLK_STS_OK;
> > }
> >
> > +static inline blk_status_t nvme_setup_copy(struct nvme_ns *ns,
> > + struct request *req, struct nvme_command *cmnd)
> > +{
> > + struct nvme_copy_range *range;
> > + struct req_iterator iter;
> > + struct bio_vec bvec;
> > + u16 control = 0;
> > + int i = 0;
>
> Make this unsigned to avoid sign extension when used as an index?
>
> > +
> > + static const size_t alloc_size = sizeof(*range) * NVME_COPY_MAX_RANGES;
> > +
> > + if (WARN_ON_ONCE(blk_rq_nr_phys_segments(req) >= NVME_COPY_MAX_RANGES))
>
> Should be > instead of >=?
>
> > + return BLK_STS_IOERR;
> > +
> > + range = kzalloc(alloc_size, GFP_ATOMIC | __GFP_NOWARN);
> > + if (!range)
> > + return BLK_STS_RESOURCE;
> > +
> > + if (req->cmd_flags & REQ_FUA)
> > + control |= NVME_RW_FUA;
> > + if (req->cmd_flags & REQ_FAILFAST_DEV)
> > + control |= NVME_RW_LR;
> > +
> > + rq_for_each_copy_bvec(bvec, req, iter) {
> > + u64 slba = nvme_sect_to_lba(ns->head, bvec.bv_sector);
> > + u64 nlb = nvme_sect_to_lba(ns->head, bvec.bv_sectors) - 1;
> > +
> > + range[i].slba = cpu_to_le64(slba);
> > + range[i].nlb = cpu_to_le16(nlb);
> > + i++;
> > + }
> > +
> > + memset(cmnd, 0, sizeof(*cmnd));
> > + cmnd->copy.opcode = nvme_cmd_copy;
> > + cmnd->copy.nsid = cpu_to_le32(ns->head->ns_id);
> > + cmnd->copy.nr_range = i - 1;
> > + cmnd->copy.sdlba = cpu_to_le64(nvme_sect_to_lba(ns->head,
> > + blk_rq_pos(req)));
> > + cmnd->copy.control = cpu_to_le16(control);
> > +
> > + bvec_set_virt(&req->special_vec, range, alloc_size);
>
> alloc_size should be sizeof(*range) * i? Otherwise this exceeds the
> amount of data used by the Copy command, which not all controllers
> support (see bit LLDTS of SGLS in the Identify Controller data
> structure). We have seen the same behavior with Dataset Management
> (always specifying 4 KB of data), which also passes the maximum size
> of the allocation to bvec_set_virt().
I see that was added in commit 530436c45ef2e ("nvme: Discard
workaround for non-conformant devices"). I would rather wait for
evidence of non-conformant devices supporting Copy before implementing
the same spec-noncompliant workaround. It could be a quirk if
necessary.
Best,
Caleb
More information about the Linux-nvme
mailing list