[PATCH v4 02/10] block: Add copy offload support infrastructure

Nitesh Shetty nj.shetty at samsung.com
Thu Apr 28 01:01:05 PDT 2022


On Thu, Apr 28, 2022 at 07:04:13AM +0900, Damien Le Moal wrote:
> On 4/28/22 00:15, Nitesh Shetty wrote:
> > On Wed, Apr 27, 2022 at 11:45:26AM +0900, Damien Le Moal wrote:
> >> On 4/26/22 19:12, Nitesh Shetty wrote:
> >>> Introduce blkdev_issue_copy which supports source and destination bdevs,
> >>> and an array of (source, destination and copy length) tuples.
> >>> Introduce REQ_COPY copy offload operation flag. Create a read-write
> >>> bio pair with a token as payload and submit it to the device in order.
> >>> The read request populates the token with source-specific information,
> >>> which is then passed along with the write request.
> >>> This design is courtesy of Mikulas Patocka's token-based copy.
> >>>
> >>> Larger copies will be divided based on the max_copy_sectors and
> >>> max_copy_range_sector limits.
> >>>
> >>> Signed-off-by: Nitesh Shetty <nj.shetty at samsung.com>
> >>> Signed-off-by: Arnav Dawn <arnav.dawn at samsung.com>
> >>> ---
> >>>  block/blk-lib.c           | 232 ++++++++++++++++++++++++++++++++++++++
> >>>  block/blk.h               |   2 +
> >>>  include/linux/blk_types.h |  21 ++++
> >>>  include/linux/blkdev.h    |   2 +
> >>>  include/uapi/linux/fs.h   |  14 +++
> >>>  5 files changed, 271 insertions(+)
> >>>
> >>> diff --git a/block/blk-lib.c b/block/blk-lib.c
> >>> index 09b7e1200c0f..ba9da2d2f429 100644
> >>> --- a/block/blk-lib.c
> >>> +++ b/block/blk-lib.c
> >>> @@ -117,6 +117,238 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >>>  }
> >>>  EXPORT_SYMBOL(blkdev_issue_discard);
> >>>  
> >>> +/*
> >>> + * Wait on and process all in-flight BIOs.  This must only be called once
> >>> + * all bios have been issued so that the refcount can only decrease.
> >>> + * This just waits for all bios to make it through bio_copy_end_io. IO
> >>> + * errors are propagated through cio->io_err.
> >>> + */
> >>> +static int cio_await_completion(struct cio *cio)
> >>> +{
> >>> +	int ret = 0;
> >>> +	unsigned long flags;
> >>> +
> >>> +	spin_lock_irqsave(&cio->lock, flags);
> >>> +	if (cio->refcount) {
> >>> +		cio->waiter = current;
> >>> +		__set_current_state(TASK_UNINTERRUPTIBLE);
> >>> +		spin_unlock_irqrestore(&cio->lock, flags);
> >>> +		blk_io_schedule();
> >>> +		/* wake up sets us TASK_RUNNING */
> >>> +		spin_lock_irqsave(&cio->lock, flags);
> >>> +		cio->waiter = NULL;
> >>> +		ret = cio->io_err;
> >>> +	}
> >>> +	spin_unlock_irqrestore(&cio->lock, flags);
> >>> +	kvfree(cio);
> >>
> >> cio is allocated with kzalloc() == kmalloc(). So why the kvfree() here ?
> >>
> > 
> > acked.
> > 
> >>> +
> >>> +	return ret;
> >>> +}
> >>> +
> >>> +static void bio_copy_end_io(struct bio *bio)
> >>> +{
> >>> +	struct copy_ctx *ctx = bio->bi_private;
> >>> +	struct cio *cio = ctx->cio;
> >>> +	sector_t clen;
> >>> +	int ri = ctx->range_idx;
> >>> +	unsigned long flags;
> >>> +	bool wake = false;
> >>> +
> >>> +	if (bio->bi_status) {
> >>> +		cio->io_err = bio->bi_status;
> >>> +		clen = (bio->bi_iter.bi_sector << SECTOR_SHIFT) - ctx->start_sec;
> >>> +		cio->rlist[ri].comp_len = min_t(sector_t, clen, cio->rlist[ri].comp_len);
> >>
> >> long line.
> > 
> > Is it because the line is longer than 80 characters? I thought the limit
> > was 100 now, so I went with longer lines.
> 
> When it is easy to wrap the lines without readability loss, please do so
> to keep things under 80 characters per line.
> 
>

acked
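
For illustration, the line could be wrapped roughly as below. This is a
userspace sketch only, not the code from the patch: min_t() is redefined
here as a simplified stand-in for the kernel macro, and
struct range_entry_sketch and clamp_comp_len() are hypothetical names
introduced just for this example.

```c
#include <stdint.h>

typedef uint64_t sector_t;

/*
 * Simplified userspace stand-in for the kernel's min_t() (assumption:
 * unlike the kernel macro, this one evaluates its arguments twice).
 */
#define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Hypothetical stand-in for one entry of cio->rlist. */
struct range_entry_sketch {
	sector_t comp_len;
};

/*
 * Clamp the completed length of range 'ri' to 'clen', with the
 * assignment wrapped so that it stays under 80 columns.
 */
static void clamp_comp_len(struct range_entry_sketch *rlist, int ri,
			   sector_t clen)
{
	rlist[ri].comp_len = min_t(sector_t, clen,
				   rlist[ri].comp_len);
}
```

The continuation line is aligned with the first argument, so the wrap
does not cost any readability.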

> >>> +{
> >>> +	struct request_queue *src_q = bdev_get_queue(src_bdev);
> >>> +	struct request_queue *dest_q = bdev_get_queue(dest_bdev);
> >>> +	int ret = -EINVAL;
> >>> +
> >>> +	if (!src_q || !dest_q)
> >>> +		return -ENXIO;
> >>> +
> >>> +	if (!nr)
> >>> +		return -EINVAL;
> >>> +
> >>> +	if (nr >= MAX_COPY_NR_RANGE)
> >>> +		return -EINVAL;
> >>
> >> Where do you check the number of ranges against what the device can do ?
> >>
> > 
> > The present implementation submits only one range at a time. This was
> > done to make copy offload generic, so that other copy implementations
> > such as XCOPY can use the same infrastructure. The downside at present
> > is that NVMe copy offload is not optimal.
> 
> If you issue one range at a time without checking the number of ranges,
> what is the point of the nr ranges queue limit ? The user can submit a
> copy ioctl request exceeding it. Please use that limit and enforce it or
> remove it entirely.
> 

Sure, I will remove this limit in the next version.

--
Thank you
Nitesh Shetty


