[PATCH 2/6] block: add support for selecting all zones

Javier González javier at javigon.com
Fri Jun 26 01:58:52 EDT 2020


On 26.06.2020 01:27, Damien Le Moal wrote:
>On 2020/06/25 21:22, Javier González wrote:
>> From: Javier González <javier.gonz at samsung.com>
>>
>> Add flag to allow selecting all zones on a single zone management
>> operation
>>
>> Signed-off-by: Javier González <javier.gonz at samsung.com>
>> Signed-off-by: SelvaKumar S <selvakuma.s1 at samsung.com>
>> Signed-off-by: Kanchan Joshi <joshi.k at samsung.com>
>> Signed-off-by: Nitesh Shetty <nj.shetty at samsung.com>
>> ---
>>  block/blk-zoned.c             | 3 +++
>>  include/linux/blk_types.h     | 3 ++-
>>  include/uapi/linux/blkzoned.h | 9 +++++++++
>>  3 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index e87c60004dc5..29194388a1bb 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -420,6 +420,9 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>  		return -ENOTTY;
>>  	}
>>
>> +	if (zmgmt.flags & BLK_ZONE_SELECT_ALL)
>> +		op |= REQ_ZONE_ALL;
>> +
>>  	return blkdev_zone_mgmt(bdev, op, zmgmt.sector, zmgmt.nr_sectors,
>>  				GFP_KERNEL);
>>  }
>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>> index ccb895f911b1..16b57fb2b99c 100644
>> --- a/include/linux/blk_types.h
>> +++ b/include/linux/blk_types.h
>> @@ -351,6 +351,7 @@ enum req_flag_bits {
>>  	 * work item to avoid such priority inversions.
>>  	 */
>>  	__REQ_CGROUP_PUNT,
>> +	__REQ_ZONE_ALL,		/* apply zone operation to all zones */
>>
>>  	/* command specific flags for REQ_OP_WRITE_ZEROES: */
>>  	__REQ_NOUNMAP,		/* do not free blocks when zeroing */
>> @@ -378,7 +379,7 @@ enum req_flag_bits {
>>  #define REQ_BACKGROUND		(1ULL << __REQ_BACKGROUND)
>>  #define REQ_NOWAIT		(1ULL << __REQ_NOWAIT)
>>  #define REQ_CGROUP_PUNT		(1ULL << __REQ_CGROUP_PUNT)
>> -
>> +#define REQ_ZONE_ALL		(1ULL << __REQ_ZONE_ALL)
>>  #define REQ_NOUNMAP		(1ULL << __REQ_NOUNMAP)
>>  #define REQ_HIPRI		(1ULL << __REQ_HIPRI)
>>
>> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
>> index 07b5fde21d9f..a8c89fe58f97 100644
>> --- a/include/uapi/linux/blkzoned.h
>> +++ b/include/uapi/linux/blkzoned.h
>> @@ -157,6 +157,15 @@ enum blk_zone_action {
>>  	BLK_ZONE_MGMT_RESET	= 0x4,
>>  };
>>
>> +/**
>> + * enum blk_zone_mgmt_flags - Flags for blk_zone_mgmt
>> + *
>> + * BLK_ZONE_SELECT_ALL: Select all zones for current zone action
>> + */
>> +enum blk_zone_mgmt_flags {
>> +	BLK_ZONE_SELECT_ALL	= 1 << 0,
>> +};
>> +
>>  /**
>>   * struct blk_zone_mgmt - Extended zoned management
>>   *
>>
>
>NACK.
>
>Details:
>1) REQ_OP_ZONE_RESET together with REQ_ZONE_ALL is the same as
>REQ_OP_ZONE_RESET_ALL, isn't it ? You are duplicating a functionality that
>already exists.
>2) The patch introduces REQ_ZONE_ALL at the block layer only without defining
>how it ties into SCSI and NVMe driver use of it. Is REQ_ZONE_ALL indicating that
>the zone management commands are to be executed with the ALL bit set ? If yes,
>that will break device-mapper. See the special code for handling
>REQ_OP_ZONE_RESET_ALL. That code is in place for a reason: the target block
>device may not be an entire physical device. In that case, applying a zone
>management command to all zones of the physical drive is wrong.
>3) REQ_ZONE_ALL seems completely equivalent to specifying a sector range of [0
>.. drive capacity]. So what is the point ? The current interface handles that.
>That is how we chose between REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL right now.
>4) Without any in-kernel user, I do not see the point. And for applications, I
>do not see any good use case for doing open all, close all, offline all or
>finish all. If you have any such good use case, please elaborate.
>

The main use if reset all, but without having to look through all zones,
as it imposes an overhead when we have a large number of zones. Having
the possibility to offload it to HW is more efficient.

I had not thought about the device mapper use case. Would it be an
option to translate this into REQ_OP_ZONE_RESET_ALL when we have a
device mapper (or any other case where this might break) and then leave
the bit go to the driver if it applies to the whole device?

Javier



More information about the Linux-nvme mailing list