[PATCH 3/6] block: add support for zone offline transition
Javier González
javier at javigon.com
Fri Jun 26 02:58:22 EDT 2020
On 26.06.2020 06:42, Damien Le Moal wrote:
>On 2020/06/26 15:09, Javier González wrote:
>> On 26.06.2020 01:34, Damien Le Moal wrote:
>>> On 2020/06/25 21:22, Javier González wrote:
>>>> From: Javier González <javier.gonz at samsung.com>
>>>>
>>>> Add support for offline transition on the zoned block device using the
>>>> new zone management IOCTL
>>>>
>>>> Signed-off-by: Javier González <javier.gonz at samsung.com>
>>>> Signed-off-by: SelvaKumar S <selvakuma.s1 at samsung.com>
>>>> Signed-off-by: Kanchan Joshi <joshi.k at samsung.com>
>>>> Signed-off-by: Nitesh Shetty <nj.shetty at samsung.com>
>>>> ---
>>>> block/blk-core.c | 2 ++
>>>> block/blk-zoned.c | 3 +++
>>>> drivers/nvme/host/core.c | 3 +++
>>>> include/linux/blk_types.h | 3 +++
>>>> include/linux/blkdev.h | 1 -
>>>> include/uapi/linux/blkzoned.h | 1 +
>>>> 6 files changed, 12 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 03252af8c82c..589cbdacc5ec 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -140,6 +140,7 @@ static const char *const blk_op_name[] = {
>>>> REQ_OP_NAME(ZONE_CLOSE),
>>>> REQ_OP_NAME(ZONE_FINISH),
>>>> REQ_OP_NAME(ZONE_APPEND),
>>>> + REQ_OP_NAME(ZONE_OFFLINE),
>>>> REQ_OP_NAME(WRITE_SAME),
>>>> REQ_OP_NAME(WRITE_ZEROES),
>>>> REQ_OP_NAME(SCSI_IN),
>>>> @@ -1030,6 +1031,7 @@ generic_make_request_checks(struct bio *bio)
>>>> case REQ_OP_ZONE_OPEN:
>>>> case REQ_OP_ZONE_CLOSE:
>>>> case REQ_OP_ZONE_FINISH:
>>>> + case REQ_OP_ZONE_OFFLINE:
>>>> if (!blk_queue_is_zoned(q))
>>>> goto not_supported;
>>>> break;
>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>> index 29194388a1bb..704fc15813d1 100644
>>>> --- a/block/blk-zoned.c
>>>> +++ b/block/blk-zoned.c
>>>> @@ -416,6 +416,9 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>>> case BLK_ZONE_MGMT_RESET:
>>>> op = REQ_OP_ZONE_RESET;
>>>> break;
>>>> + case BLK_ZONE_MGMT_OFFLINE:
>>>> + op = REQ_OP_ZONE_OFFLINE;
>>>> + break;
>>>> default:
>>>> return -ENOTTY;
>>>> }
>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>> index f1215523792b..5b95c81d2a2d 100644
>>>> --- a/drivers/nvme/host/core.c
>>>> +++ b/drivers/nvme/host/core.c
>>>> @@ -776,6 +776,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
>>>> case REQ_OP_ZONE_FINISH:
>>>> ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_FINISH);
>>>> break;
>>>> + case REQ_OP_ZONE_OFFLINE:
>>>> + ret = nvme_setup_zone_mgmt_send(ns, req, cmd, NVME_ZONE_OFFLINE);
>>>> + break;
>>>> case REQ_OP_WRITE_ZEROES:
>>>> ret = nvme_setup_write_zeroes(ns, req, cmd);
>>>> break;
>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>> index 16b57fb2b99c..b3921263c3dd 100644
>>>> --- a/include/linux/blk_types.h
>>>> +++ b/include/linux/blk_types.h
>>>> @@ -316,6 +316,8 @@ enum req_opf {
>>>> REQ_OP_ZONE_FINISH = 12,
>>>> /* write data at the current zone write pointer */
>>>> REQ_OP_ZONE_APPEND = 13,
>>>> + /* Transition a zone to offline */
>>>> + REQ_OP_ZONE_OFFLINE = 14,
>>>>
>>>> /* SCSI passthrough using struct scsi_request */
>>>> REQ_OP_SCSI_IN = 32,
>>>> @@ -456,6 +458,7 @@ static inline bool op_is_zone_mgmt(enum req_opf op)
>>>> case REQ_OP_ZONE_OPEN:
>>>> case REQ_OP_ZONE_CLOSE:
>>>> case REQ_OP_ZONE_FINISH:
>>>> + case REQ_OP_ZONE_OFFLINE:
>>>> return true;
>>>> default:
>>>> return false;
>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>>> index bd8521f94dc4..8308d8a3720b 100644
>>>> --- a/include/linux/blkdev.h
>>>> +++ b/include/linux/blkdev.h
>>>> @@ -372,7 +372,6 @@ extern int blkdev_zone_ops_ioctl(struct block_device *bdev, fmode_t mode,
>>>> unsigned int cmd, unsigned long arg);
>>>> extern int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
>>>> unsigned int cmd, unsigned long arg);
>>>> -
>>>> #else /* CONFIG_BLK_DEV_ZONED */
>>>>
>>>> static inline unsigned int blkdev_nr_zones(struct gendisk *disk)
>>>> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
>>>> index a8c89fe58f97..d0978ee10fc7 100644
>>>> --- a/include/uapi/linux/blkzoned.h
>>>> +++ b/include/uapi/linux/blkzoned.h
>>>> @@ -155,6 +155,7 @@ enum blk_zone_action {
>>>> BLK_ZONE_MGMT_FINISH = 0x2,
>>>> BLK_ZONE_MGMT_OPEN = 0x3,
>>>> BLK_ZONE_MGMT_RESET = 0x4,
>>>> + BLK_ZONE_MGMT_OFFLINE = 0x5,
>>>> };
>>>>
>>>> /**
>>>>
>>>
>>> As mentioned in previous email, the usefulness of this is dubious. Please
>>> elaborate in the commit message. Sure NVMe ZNS defines this and we can support
>>> it. But without a good use case, what is the point ?
>>
>> Use case is to transition zones in read-only state to offline when we
>> are done moving valid data. It is easier to explicitly managing zones
>> that are not usable by having all under the offline state.
>
>Then adding a simple BLKZONEOFFLINE ioctl, similar to open, close, finish and
>reset, would be enough. No need for all the new zone management ioctl with flags
>plumbing.
Ok. We can add that then.
Note that zone management is not motivated by this use case at all, but
it made sense to implement it here instead of as a new BLKZONEOFFLINE
IOCTL as ZAC/ZBC users will not be able to use it either way.
>
>>
>>>
>>> scsi SD driver will return BLK_STS_NOTSUPP if this offlining is sent to a
>>> ZBC/ZAC drive. Not nice. Having a sysfs attribute "max_offline_zone_sectors" or
>>> the like to indicate support by the device or not would be nicer.
>>
>> We can do that.
>>
>>>
>>> Does offling ALL zones make any sense ? Because this patch does not prevent the
>>> use of the REQ_ZONE_ALL flags introduced in patch 2. Probably not a good idea to
>>> allow offlining all zones, no ?
>>
>> AFAIK the transition to offline is only valid when coming from a
>> read-only state. I did think of adding a check, but I can see that other
>> transitions go directly to the driver and then the device, so I decided
>> to follow the same model. If you think it is better, we can add the
>> check.
>
>My point was that the REQ_ZONE_ALL flag would make no sense for offlining zones
>but this patch does not have anything checking that. There is no point in
>sending a command that is known to be incorrect to the drive...
I will add some extra checks then to fail early. I assume these should
be in the NVMe driver as it is NVMe-specific, right?
Javier
More information about the Linux-nvme
mailing list