[PATCH V4 2/9] nvmet: add ZNS support for bdev-ns

Chaitanya Kulkarni Chaitanya.Kulkarni at wdc.com
Thu Dec 3 22:13:40 EST 2020


On 12/2/20 01:07, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 10:22:20PM -0800, Chaitanya Kulkarni wrote:
>> Add zns-bdev-config, id-ctrl, id-ns, zns-cmd-effects, zone-mgmt-send,
>> zone-mgmt-recv and zone-append handlers
> If you think you need to speel out command please do so as in the spec
> instead of uisng strange abbreviations.
>
> That being said the commit log should document the why and the overall
> architecture considerations, not low-level details like this.

The commit log is really bad, let me rewrite in a more meaningful way.

>
>> +#ifdef CONFIG_BLK_DEV_ZONED
>> +	struct nvme_id_ns_zns	id_zns;
>> +	unsigned int		zasl;
>> +#endif
> This wastes a lot of space for all normal non-zns uses.  Please have some
> sort of private data field that zns can use.  Why do we even need to store
> the whole data structure instead of the few native format fields we need?
No need, this should use the fields than data structure.
>>  static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
>> @@ -251,6 +255,10 @@ struct nvmet_subsys {
>>  	unsigned int		admin_timeout;
>>  	unsigned int		io_timeout;
>>  #endif /* CONFIG_NVME_TARGET_PASSTHRU */
>> +
>> +#ifdef CONFIG_BLK_DEV_ZONED
>> +	struct nvme_id_ctrl_zns	id_ctrl_zns;
>> +#endif
> Same here.
Yep.
>
>> +#define NVMET_MPSMIN_SHIFT	12
> Needs some documentation on the why.
Okay.
>> +static inline struct block_device *nvmet_bdev(struct nvmet_req *req)
>> +{
>> +	return req->ns->bdev;
>> +}
> I don't really see the point of this helper.
Okay, I'll remove it.
>> +/*
>> + *  ZNS related command implementation and helpers.
>> + */
>> +
>> +u16 nvmet_process_zns_cis(struct nvmet_req *req, off_t *off)
>> +{
>> +	u16 nvme_cis_zns = NVME_CSI_ZNS;
>> +
>> +	if (!bdev_is_zoned(nvmet_bdev(req)))
>> +		return NVME_SC_SUCCESS;
>> +
>> +	return nvmet_copy_ns_identifier(req, NVME_NIDT_CSI, NVME_NIDT_CSI_LEN,
>> +					&nvme_cis_zns, off);
>> +}
> This looks weird.  We can want to support the command set identifier in
> general, so this should go into common code, and just look up the command
> set identifier in the nvmet_ns structure.

ZNS is the only user for this, so I've added to the zns code. I'll move to

admin-cmd.

>> +void nvmet_zns_add_cmd_effects(struct nvme_effects_log *log)
>> +{
>> +	log->iocs[nvme_cmd_zone_append]		= cpu_to_le32(1 << 0);
>> +	log->iocs[nvme_cmd_zone_mgmt_send]	= cpu_to_le32(1 << 0);
>> +	log->iocs[nvme_cmd_zone_mgmt_recv]	= cpu_to_le32(1 << 0);
>> +}
> Just add this to the caller under an if.  For the if a helper that checks
> IS_ENABLED() and the csi would be useful.
Okay.
>> +
>> +static inline bool nvmet_bdev_validate_zns_zones(struct nvmet_ns *ns)
>> +{
>> +	if (ns->bdev->bd_disk->queue->conv_zones_bitmap) {
>> +		pr_err("block devices with conventional zones are not supported.");
>> +		return false;
>> +	}
>> +
>> +	return !(get_capacity(ns->bdev->bd_disk) &
>> +			(bdev_zone_sectors(ns->bdev) - 1));
>> +}
> I think this should be open coded in the caller.
>
Okay.
>> +static inline u8 nvmet_zasl(unsigned int zone_append_sects)
>> +{
>> +	/*
>> +	 * Zone Append Size Limit is the value experessed in the units
>> +	 * of minimum memory page size (i.e. 12) and is reported power of 2.
>> +	 */
>> +	return ilog2((zone_append_sects << 9) >> NVMET_MPSMIN_SHIFT);
>> +}
>> +
>> +static inline void nvmet_zns_update_zasl(struct nvmet_ns *ns)
>> +{
>> +	struct request_queue *q = ns->bdev->bd_disk->queue;
>> +	struct nvmet_ns *ins;
>> +	unsigned long idx;
>> +	u8 min_zasl;
>> +
>> +	/*
>> +	 * Calculate new ctrl->zasl value when enabling the new ns. This value
>> +	 * has to be the minimum of the max_zone_append values from available
>> +	 * namespaces.
>> +	 */
>> +	min_zasl = ns->zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
>> +
>> +	xa_for_each(&(ns->subsys->namespaces), idx, ins) {
>> +		struct request_queue *iq = ins->bdev->bd_disk->queue;
>> +		unsigned int imax_za_sects = queue_max_zone_append_sectors(iq);
>> +		u8 izasl = nvmet_zasl(imax_za_sects);
>> +
>> +		if (!bdev_is_zoned(ins->bdev))
>> +			continue;
>> +
>> +		min_zasl = min_zasl > izasl ? izasl : min_zasl;
>> +	}
>> +
>> +	ns->subsys->id_ctrl_zns.zasl = min_zasl;
>> +}
> This will change the limit when a new namespaces is added.  I think we need
> to just pick the value of the first namespaces and refuse adding a new
> one if the limit is lower to not completely break hosts.

But that will force users to add ns with highest zasl first no matter what.

Isn't there should be a way to update the host with async event

so that host can refresh the ctrl->zasl when ns addition async notification

is generated ?

>> +void nvmet_bdev_execute_zone_append(struct nvmet_req *req)
> This whole function looks weird.  I'd expect that we mostly (if not
> entirely) reuse nvmet_bdev_execute_rw, just using bio_add_hw_page
> instead of bio_add_page and setting up the proper op field.

that was the initial choice, please see the reply to the your comment

on the first patch.

>> +#else  /* CONFIG_BLK_DEV_ZONED */
> We really do try to avoid these kinds of ifdefs in .c files.  Either
> put the stubs inline into the header, of use IS_ENABLED() magic in
> the callers.  In this case I think the new helper I mentioned above
> which checks IS_ENABLED + the csi seems like the right way.
>
With addition of the these empty stubs these functions are getting added

to all the transport code which has nothing to with the backend that felt

wrong to me. But if you are okay with that I'll make this change.






More information about the Linux-nvme mailing list