[PATCH V12 2/3] nvmet: add ZBD over ZNS backend support

Chaitanya Kulkarni Chaitanya.Kulkarni at wdc.com
Mon Mar 15 03:54:29 GMT 2021


Christoph/Damien,

On 3/11/21 23:26, Damien Le Moal wrote:
>>>> +void nvmet_bdev_execute_zone_mgmt_recv(struct nvmet_req *req)
>>>> +{
>>>> +	sector_t sect = nvmet_lba_to_sect(req->ns, req->cmd->zmr.slba);
>>>> +	u32 bufsize = (le32_to_cpu(req->cmd->zmr.numd) + 1) << 2;
>>>> +	struct nvmet_report_zone_data data = { .ns = req->ns };
>>>> +	unsigned int nr_zones;
>>>> +	int reported_zones;
>>>> +	u16 status;
>>>> +
>>>> +	status = nvmet_bdev_zns_checks(req);
>>>> +	if (status)
>>>> +		goto out;
>>>> +
>>>> +	data.rz = __vmalloc(bufsize, GFP_KERNEL | __GFP_NORETRY | __GFP_ZERO);
>>>> +	if (!data.rz) {
>>>> +		status = NVME_SC_INTERNAL;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	nr_zones = (bufsize - sizeof(struct nvme_zone_report)) /
>>>> +			sizeof(struct nvme_zone_descriptor);
>>>> +	if (!nr_zones) {
>>>> +		status = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
>>>> +		goto out_free_report_zones;
>>>> +	}
>>>> +
>>>> +	reported_zones = blkdev_report_zones(req->ns->bdev, sect, nr_zones,
>>>> +					     nvmet_bdev_report_zone_cb, &data);
>>>> +	if (reported_zones < 0) {
>>>> +		status = NVME_SC_INTERNAL;
>>>> +		goto out_free_report_zones;
>>>> +	}
>>> There is a problem here: the code as is ignores the request reporting option
>>> field which can lead to an invalid zone report being returned. I think you need
>>> to modify nvmet_bdev_report_zone_cb() to look at the reporting option field
>>> passed by the initiator and filter the zone report since blkdev_report_zones()
>>> does not handle that argument.
>> The reporting options are set statically by the host in
>> nvme_ns_report_zones():
>>          c.zmr.zra = NVME_ZRA_ZONE_REPORT;
>>          c.zmr.zrasf = NVME_ZRASF_ZONE_REPORT_ALL;
>>          c.zmr.pr = NVME_REPORT_ZONE_PARTIAL;
>>
>> All the above values are validated in the nvmet_bdev_zns_checks() helper
>> called from nvmet_bdev_execute_zone_mgmt_recv() before we allocate the
>> report zone buffer.
>>
>> 1. c.zmr.zra indicates the action which Reports zone descriptor entries
>>    through the Report Zones data structure.
>>
>>    We validate that this value has been set to NVME_ZRA_ZONE_REPORT in
>>    nvmet_bdev_zns_checks(). We call report zones only after checking that
>>    the zone receive action is NVME_ZRA_ZONE_REPORT, so no filtering is
>>    needed in nvmet_bdev_report_zone_cb().
>>
>> 2. c.zmr.zrasf indicates the action specific field which is set to
>>    NVME_ZRASF_ZONE_REPORT_ALL.
>>
>>    We validate that this value has been set to NVME_ZRASF_ZONE_REPORT_ALL in
>>    nvmet_bdev_zns_checks(). Since the host wants all the zones, we don't need
>>    to filter any zone states in nvmet_bdev_report_zone_cb().
>>
>> 3. c.zmr.pr is set to NVME_REPORT_ZONE_PARTIAL (value 1), i.e. the Number of
>>    Zones field in the Report Zones data structure indicates the number of
>>    fully transferred zone descriptors in the data buffer, which we set from
>>    the return value of blkdev_report_zones():
>>   
>>
>>    reported_zones = blkdev_report_zones(req->ns->bdev, sect, nr_zones,
>> 					     nvmet_bdev_report_zone_cb, &data);
>> <snip>   data.rz->nr_zones = cpu_to_le64(reported_zones);
>>
>>    So no filtering is needed in nvmet_bdev_report_zone_cb() for c.zmr.pr.
>>
>> Can you please explain what filtering is missing in the current code ?
>>
>> Maybe I'm looking into an old spec.
> report zones command has the reporting options (ro) field (bits 15:08 of dword
> 13) where the user can specify the following values:
>
> Value Description
> 0h List all zones.
> 1h List the zones in the ZSE:Empty state.
> 2h List the zones in the ZSIO:Implicitly Opened state.
> 3h List the zones in the ZSEO:Explicitly Opened state.
> 4h List the zones in the ZSC:Closed state.
> 5h List the zones in the ZSF:Full state.
> 6h List the zones in the ZSRO:Read Only state.
> 7h List the zones in the ZSO:Offline state.
>
> to filter the zone report based on zone condition. blkdev_report_zones() will
> always do a "list all zones", that is, ro == 0h.
>
> But on the initiator side, if the client issues a report zones command through an
> ioctl (passthrough/direct access, not using the block layer BLKREPORTZONES
> ioctl), it may specify a different value for the ro field. Processing that
> command using blkdev_report_zones() like you are doing here without any
> additional filtering will give an incorrect report. Filtering based on the user
> specified ro field needs to be added in nvmet_bdev_report_zone_cb().
>
> The current code here is fine if the initiator/client side uses the block layer
> and executes all report zones through blkdev_report_zones(). But things will
> break if the client starts doing passthrough commands using nvme ioctl. No ?
>
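
For reference, the filtering described above could look roughly like the
following. This is a standalone userspace sketch, not the nvmet code: every
sketch_* name is hypothetical, the reporting option values are taken from the
table quoted above, and the zone condition values are assumed to mirror the
kernel's BLK_ZONE_COND_* constants.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reporting option (ro) values, as listed in the table quoted above. */
enum sketch_ro {
	SKETCH_RO_ALL       = 0x0,
	SKETCH_RO_EMPTY     = 0x1,
	SKETCH_RO_IMP_OPEN  = 0x2,
	SKETCH_RO_EXP_OPEN  = 0x3,
	SKETCH_RO_CLOSED    = 0x4,
	SKETCH_RO_FULL      = 0x5,
	SKETCH_RO_READONLY  = 0x6,
	SKETCH_RO_OFFLINE   = 0x7,
};

/* Zone conditions; values assumed to mirror BLK_ZONE_COND_*. */
enum sketch_cond {
	SKETCH_COND_NOT_WP   = 0x0,
	SKETCH_COND_EMPTY    = 0x1,
	SKETCH_COND_IMP_OPEN = 0x2,
	SKETCH_COND_EXP_OPEN = 0x3,
	SKETCH_COND_CLOSED   = 0x4,
	SKETCH_COND_READONLY = 0xD,
	SKETCH_COND_FULL     = 0xE,
	SKETCH_COND_OFFLINE  = 0xF,
};

/* Hypothetical stand-in for a zone as seen by the report callback. */
struct sketch_zone {
	uint64_t start;
	uint8_t cond;
};

/* Hypothetical per-request state handed to the callback. */
struct sketch_report {
	uint8_t ro;               /* reporting option requested by the host */
	size_t max;               /* descriptors that fit in the buffer */
	size_t nr;                /* descriptors stored so far */
	struct sketch_zone *out;  /* destination buffer */
};

/* Return true if a zone in condition @cond should be listed for @ro. */
static bool sketch_zone_matches(uint8_t ro, uint8_t cond)
{
	static const uint8_t ro_to_cond[] = {
		[SKETCH_RO_EMPTY]    = SKETCH_COND_EMPTY,
		[SKETCH_RO_IMP_OPEN] = SKETCH_COND_IMP_OPEN,
		[SKETCH_RO_EXP_OPEN] = SKETCH_COND_EXP_OPEN,
		[SKETCH_RO_CLOSED]   = SKETCH_COND_CLOSED,
		[SKETCH_RO_FULL]     = SKETCH_COND_FULL,
		[SKETCH_RO_READONLY] = SKETCH_COND_READONLY,
		[SKETCH_RO_OFFLINE]  = SKETCH_COND_OFFLINE,
	};

	if (ro == SKETCH_RO_ALL)
		return true;
	if (ro > SKETCH_RO_OFFLINE)
		return false;	/* unknown option: list nothing */
	return ro_to_cond[ro] == cond;
}

/* Per-zone callback: skip non-matching zones, store the rest. */
static int sketch_report_zone_cb(const struct sketch_zone *z,
				 unsigned int idx, void *data)
{
	struct sketch_report *rep = data;

	(void)idx;
	if (!sketch_zone_matches(rep->ro, z->cond))
		return 0;
	if (rep->nr >= rep->max)
		return 0;	/* buffer full: drop further descriptors */
	rep->out[rep->nr++] = *z;
	return 0;
}
```

With something like this, the Number of Zones field would have to come from
the count of stored descriptors rather than from the number of zones walked,
since the two differ once zones are filtered out.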

Regarding passthru command support, I ran into the following non-technical
issues. It would be great to get some feedback before we proceed:

1. We don't support any passthru commands for the generic NVMeOF target
   backends (file/bdev); for that we have a dedicated passthru target. The
   ZBD backend is a generic one, so do we really want to add support for
   users' NVMe passthru commands?

2. In the past we decided that, for the NVMeOF target, non-I/O commands
   should be handled in userspace applications to keep the target light.

I couldn't come up with a justification :(.

There are other features that we may have to implement apart from filtering
if we want to support NVMe passthru commands, e.g. Set Zone Descriptor
Extension, Changed Zone List, AEN Zone Descriptor Changed Notices (Log
Identifier BFh), etc. (These are just placeholder examples, please do not
take them literally.)


