[RFC PATCH 1/8] block: add support for REQ_OP_VERIFY
Chaitanya Kulkarni
chaitanyak at nvidia.com
Thu Nov 11 00:01:07 PST 2021
On 11/4/2021 10:25 AM, Darrick J. Wong wrote:
> External email: Use caution opening links or attachments
>
>
> On Wed, Nov 03, 2021 at 11:46:27PM -0700, Chaitanya Kulkarni wrote:
>> From: Chaitanya Kulkarni <kch at nvidia.com>
>>
>> This adds a new block layer operation to offload verifying a range of
>> LBAs. This support is needed in order to provide file systems and
>> fabrics, kernel components to offload LBA verification when it is
>> supported by the hardware controller. In case hardware offloading is
>> not supported then we provide APIs to emulate the same. The prominent
>> example of that is NVMe Verify command. We also provide an emulation of
>> the same operation which can be used in case H/W does not support
>> verify. This is still useful when block device is remotely attached e.g.
>> using NVMeOF.
>>
>> Signed-off-by: Chaitanya Kulkarni <kch at nvidia.com>
>> ---
>> Documentation/ABI/testing/sysfs-block | 14 ++
>> block/blk-core.c | 5 +
>> block/blk-lib.c | 192 ++++++++++++++++++++++++++
>> block/blk-merge.c | 19 +++
>> block/blk-settings.c | 17 +++
>> block/blk-sysfs.c | 8 ++
>> block/blk-zoned.c | 1 +
>> block/bounce.c | 1 +
>> block/ioctl.c | 35 +++++
>> include/linux/bio.h | 10 +-
>> include/linux/blk_types.h | 2 +
>> include/linux/blkdev.h | 31 +++++
>> include/uapi/linux/fs.h | 1 +
>> 13 files changed, 332 insertions(+), 4 deletions(-)
>>
>
> (skipping to the ioctl part; I didn't see anything obviously weird in
> the block/ changes)
>
Yes it is pretty straight forward.
>> diff --git a/block/ioctl.c b/block/ioctl.c
>> index d61d652078f4..5e1b3c4660bf 100644
>> --- a/block/ioctl.c
>> +++ b/block/ioctl.c
>> @@ -168,6 +168,39 @@ static int blk_ioctl_zeroout(struct block_device *bdev, fmode_t mode,
>> BLKDEV_ZERO_NOUNMAP);
>> }
>>
>> +static int blk_ioctl_verify(struct block_device *bdev, fmode_t mode,
>> + unsigned long arg)
>> +{
>> + uint64_t range[2];
>> + struct address_space *mapping;
>> + uint64_t start, end, len;
>> +
>> + if (!(mode & FMODE_WRITE))
>> + return -EBADF;
>
> Why does the fd have to be opened writable? Isn't this a read test?
>
Yes this needs to be removed, will fix it in the V1.
>> +
>> + if (copy_from_user(range, (void __user *)arg, sizeof(range)))
>> + return -EFAULT;
>> +
>> + start = range[0];
>> + len = range[1];
>> + end = start + len - 1;
>> +
>> + if (start & 511)
>> + return -EINVAL;
>> + if (len & 511)
>> + return -EINVAL;
>> + if (end >= (uint64_t)i_size_read(bdev->bd_inode))
>> + return -EINVAL;
>> + if (end < start)
>> + return -EINVAL;
>> +
>> + /* Invalidate the page cache, including dirty pages */
>> + mapping = bdev->bd_inode->i_mapping;
>> + truncate_inode_pages_range(mapping, start, end);
>
> Why do we need to invalidate the page cache to verify media? Won't that
> cause data loss if those pages were dirty and about to be flushed?
>
> --D
>
Yes, will fix it in the v1.
>> +
>> + return blkdev_issue_verify(bdev, start >> 9, len >> 9, GFP_KERNEL);
>> +}
>> +
Thanks a lot Derrik for your comments, I'll add fixes to V1.
More information about the Linux-nvme
mailing list