[PATCH 1/4] block: Add zone flags to queue zone prop.

Javier González javier at javigon.com
Fri Jul 3 02:28:15 EDT 2020


On 03.07.2020 05:20, Damien Le Moal wrote:
>On 2020/07/02 19:27, Javier González wrote:
>> On 02.07.2020 08:49, Damien Le Moal wrote:
>>> On 2020/07/02 17:34, Javier González wrote:
>>>> On 02.07.2020 07:54, Damien Le Moal wrote:
>>>>> On 2020/07/02 15:55, Javier González wrote:
>>>>>> From: Javier González <javier.gonz at samsung.com>
>>>>>>
>>>>>> As the zoned block device will have to deal with features that are
>>>>>> optional for the backend device, add a flag field to inform the block
>>>>>> layer about supported features. This builds on top of
>>>>>> blk_zone_report_flags and extendes to the zone report code.
>>>>>>
>>>>>> Signed-off-by: Javier González <javier.gonz at samsung.com>
>>>>>> Signed-off-by: SelvaKumar S <selvakuma.s1 at samsung.com>
>>>>>> Signed-off-by: Kanchan Joshi <joshi.k at samsung.com>
>>>>>> Signed-off-by: Nitesh Shetty <nj.shetty at samsung.com>
>>>>>> ---
>>>>>>  block/blk-zoned.c              | 3 ++-
>>>>>>  drivers/block/null_blk_zoned.c | 2 ++
>>>>>>  drivers/nvme/host/zns.c        | 1 +
>>>>>>  drivers/scsi/sd.c              | 2 ++
>>>>>>  include/linux/blkdev.h         | 3 +++
>>>>>>  5 files changed, 10 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>>>> index 81152a260354..0f156e96e48f 100644
>>>>>> --- a/block/blk-zoned.c
>>>>>> +++ b/block/blk-zoned.c
>>>>>> @@ -312,7 +312,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
>>>>>>  		return ret;
>>>>>>
>>>>>>  	rep.nr_zones = ret;
>>>>>> -	rep.flags = BLK_ZONE_REP_CAPACITY;
>>>>>> +	rep.flags = q->zone_flags;
>>>>>
>>>>> zone_flags seem to be a fairly generic flags field while rep.flags is only about
>>>>> the reported descriptors structure. So you may want to define a
>>>>> BLK_ZONE_REP_FLAGS_MASK and have:
>>>>>
>>>>> 	rep.flags = q->zone_flags & BLK_ZONE_REP_FLAGS_MASK;
>>>>>
>>>>> to avoid flags meaningless for the user being set.
>>>>>
>>>>> In any case, since *all* zoned block devices now report capacity, I do not
>>>>> really see the point to add BLK_ZONE_REP_FLAGS_MASK to q->zone_flags, especially
>>>>> considering that you set the flag for all zoned device types, including scsi
>>>>> which does not have zone capacity. This makes q->zone_flags rather confusing
>>>>> instead of clearly defining the device features as you mentioned in the commit
>>>>> message.
>>>>>
>>>>> I think it may be better to just drop this patch, and if needed, introduce the
>>>>> zone_flags field where it may be needed (e.g. OFFLINE zone ioctl support).
>>>>
>>>> I am using this as a way to pass the OFFLINE support flag to the block
>>>> layer. I used this too for the attributes. Are you thinking of a
>>>> different way to send this?
>>>>
>>>> I believe this fits too for capacity, but we can just pass it in
>>>> all report as before if you prefer.
>>>
>>> The point is that this patch as is does nothing and is needed as a preparatory
>>> patch if we want to have the offline flag set in the report. But:
>>> 1) As commented in the offline ioctl patch, I am not sure the flag makes a lot
>>> of sense. sysfs or nothing at all may be OK as well. When we introduced the new
>>> open/close/finish ioctls, we did not add flags to signal that the device
>>> supports them. Granted, it was for nullblk and scsi, and both had the support.
>>> But running an application using these on an old kernel, and you will get
>>> -ENOTTY, meaning, not supported. So simply introducing the offline ioctl without
>>> any flag would be OK I think.
>>
>> I see. My understanding after some comments from Christoph was that we
>> should use these bits to signal any optional features / capabilities
>> that would depend on the underlying driver, just as it is done with the
>> capacity flag today.
>>
>> If not for the offline transition, for the attributes, I see it exactly
>> as the same use case as capacity, where we signal that a new field is
>> reported in the report structure.
>>
>> Am I missing something here?
>
>Using the report flags for reporting supported operations/features is fine, but
>as already mentioned, then document that by changing the description of enum
>blk_zone_report_flags. Right now, it still says:
>
> * enum blk_zone_report_flags - Feature flags of reported zone descriptors.
>
>Which kind of really point to things the zone descriptor itself has compared to
>earlier versions of the descriptor rather than what the device can do or not.

I see how the offline transition collides with this model. But can we
agree that the zone attributes is something that fits here?

>
>And as I argued already, using a flag is fine only for letting a user know about
>a supported feature, but that is far from ideal to have device-mapper easily
>discover what a target can support or not. For that, stacked queue limits are
>much simpler. Which leads to exporting that limit in sysfs rather than using a
>flag for the user interface.

See below

>
>>> Since DM support for this would be nice too and we now are in a situation where
>>> not all devices support the  ioctl, instead of a report flag (a little out of
>>> place), a queue limit exported through sysfs may be a cleaner way to both
>>> support DM correctly (stacked limits) and signal the support to the user. If you
>>> choose this approach, then this patch is not needed. The zoned_flags, or a
>>> regular queue flag like for RESET_ALL can be added in the offline ioctl patch.
>>
>> I see. If we can reach an agreement that the above is a bad
>> understanding on my side, I will be happy to translate this into a sysfs
>> entry. If it is OK, I'll give it this week in the mailing list and send
>> a V4 next week.
>
>It is all about device mapper support... How are you planning to do it using a
>queue flag without said flags being stackable as queue limits ?

I guess what I am struggle with here is that you see the capacity
feature as a different extension than the attributes and the offline
transition.

The way I see it, there are things the device supports and things that
the driver supports. ZAC/ZBC does not support a size != capacity, but
the driver sets these values to be the same. One thing I struggle with
is that saying that we support capacity and setting size = capacity puts
the burden in the user to check these values across all zones to
determine if they can support the driver or not. I think it would be
clear to have a feature explicitly stating that capacity != size.

For the attributes, this is very similar - some apply, some don't, and
we only report the ones that make sense for each driver. I do not see
how device mappers are treated differently here than in the existing
capacity features.

For the offline transition, I see that the current patches do not set
flags properly and we would need to inherit the underlying device if we
want to do it this way. I can understand from your comments that this is
not a good way of doing it. I see in one of your comments from V1 that
we could deal with this transition in the scsi driver and keep it in
memory (i.e., device is still read-only, but driver transitions to
offline) - can you comment on this?

Javier



More information about the Linux-nvme mailing list