[PATCH 6/6] nvme: Add consistency check for zone count

Damien Le Moal Damien.LeMoal at wdc.com
Fri Jun 26 02:49:53 EDT 2020


On 2020/06/26 15:13, Javier González wrote:
> On 26.06.2020 00:04, Damien Le Moal wrote:
>> On 2020/06/26 6:49, Keith Busch wrote:
>>> On Thu, Jun 25, 2020 at 02:21:52PM +0200, Javier González wrote:
>>>>  drivers/nvme/host/zns.c | 7 +++++++
>>>>  1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
>>>> index 7d8381fe7665..de806788a184 100644
>>>> --- a/drivers/nvme/host/zns.c
>>>> +++ b/drivers/nvme/host/zns.c
>>>> @@ -234,6 +234,13 @@ static int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector,
>>>>  		sector += ns->zsze * nz;
>>>>  	}
>>>>
>>>> +	if (nr_zones < 0 && zone_idx != ns->nr_zones) {
>>>> +		dev_err(ns->ctrl->device, "inconsistent zone count %u/%u\n",
>>>> +				zone_idx, ns->nr_zones);
>>>> +		ret = -EINVAL;
>>>> +		goto out_free;
>>>> +	}
>>>> +
>>>>  	ret = zone_idx;
>>>
>>> nr_zones is unsigned, so it's never < 0.
>>>
>>> The API we're providing doesn't require zone_idx equal the namespace's
>>> nr_zones at the end, though. A subset of the total number of zones can
>>> be requested here.
>>>
> 
> I did see nr_zones coming with -1; guess it is my compiler.

See include/linux/blkdev.h. -1 is:

#define BLK_ALL_ZONES  ((unsigned int)-1)

Which is documented in block/blk-zoned.c:

/**
 * blkdev_report_zones - Get zones information
 * @bdev:       Target block device
 * @sector:     Sector from which to report zones
 * @nr_zones:   Maximum number of zones to report
 * @cb:         Callback function called for each reported zone
 * @data:       Private data for the callback
 *
 * Description:
 *    Get zone information starting from the zone containing @sector for at most
 *    @nr_zones, and call @cb for each zone reported by the device.
 *    To report all zones in a device starting from @sector, the BLK_ALL_ZONES
 *    constant can be passed to @nr_zones.
 *    Returns the number of zones reported by the device, or a negative errno
 *    value in case of failure.
 *
 *    Note: The caller must use memalloc_noXX_save/restore() calls to control
 *    memory allocations done within this function.
 */
int blkdev_report_zones(struct block_device *bdev, sector_t sector,
                        unsigned int nr_zones, report_zones_cb cb, void *data)

> 
>>
>> Yes, absolutely. zone_idx is not an absolute zone number. It is the index of the
>> reported zone descriptor in the current report range requested by the user,
>> which is not necessarily for the entire drive (i.e., provided nr zones is less
>> than the total number of zones of the disk and/or start sector is > 0). So
>> zone_idx indicates the actual number of zones reported, it is not the total
> 
> I see. As I can see, when nr_zones comes undefined I believed we could
> assume that zone_idx is absolute, but I can be wrong.

No. zone_idx is *always* the index of the zone in the current report. Whatever
that report is, regardless of the report starting point and number of zones
requested. E.g. For a single zone report (nr_zones = 1), you will always see
zone_idx = 0. For a full report, zone_idx will correspond to the zone number.
This is used for example in blk_revalidate_disk_zones() to initialize the zone
bitmaps.

> Does it make sense to support this check with an additional counter and
> a explicit nr_zones initialization when undefined or you
> prefer to just remove it as Matias suggested?

The check is not needed at all.

If the device is buggy and reports more zones than the device capacity or any
other bugs, the driver can catch that when it processes the report.
blk_revalidate_disk_zones() also has many checks.


-- 
Damien Le Moal
Western Digital Research



More information about the Linux-nvme mailing list