[PATCHv2] nvme: only use power of two io boundaries

Damien Le Moal Damien.LeMoal at wdc.com
Fri Aug 28 00:30:02 EDT 2020


On 2020/08/28 13:16, Keith Busch wrote:
> On Fri, Aug 28, 2020 at 01:03:30AM +0000, Damien Le Moal wrote:
>> On 2020/08/28 0:35, Keith Busch wrote:
>>> The kernel requires a power of two for boundaries because that's the
>>> only way it can efficiently split commands that cross them. A
>>> controller, however, may report a non-power of two boundary.
>>
>> I can understand that using a power of 2 limit for splitting can be more
>> efficient CPU-wise, but I do not think the kernel "requires" such a limit
>> alignment for splitting BIOs. BLK_SAFE_MAX_SECTORS is 255 after all. What I am
>> missing here ?
> 
> Ah, those two limits are a bit different. The max sectors just checks
> length, but chunk_sectors needs to consider length and offset, so
> checking if an IO crosses the boundary requires division. For CPU
> efficiency, power of 2 alignment is enforced through a BUG_ON() in
> blk_queue_chunk_sectors().

OK. Got it.

>  
>>> -	if (iob && !blk_queue_is_zoned(ns->queue))
>>> -		blk_queue_chunk_sectors(ns->queue, rounddown_pow_of_two(iob));
>>> +	if (is_power_of_2(iob) && !blk_queue_is_zoned(ns->queue))
>>> +		blk_queue_chunk_sectors(ns->queue, iob);
>>> +	else if (iob && !(disk->flags & GENHD_FL_UP))
>>> +		dev_warn(ctrl->device, "namespace:%u has unused io boundary:%u\n",
>>> +			ns->head->ns_id, iob);
>>
>> Isn't this going to generate a warning all the time for a ZNS drive ?
> 
> Yes, if a ZNS drive reports NOIOB, this will warn on it's initial
> discovery.
> 
> As long as zone sizes reuse the queue's chunk_sectors, the zone size
> needs to take precedence since it's more than just a hint compared to
> NOIOB. If people complain about the warning, that may indicate we ought
> to find a way to get around overloading the chunk_sectors queue limit,
> but I suspect ZNS drives won't report an iob value.

Understood. But for ZNS, since the zone size would take precedence over iob for
chunk sectors, wouldn't it make sense to have the entire code block under a
if (!blk_queue_is_zoned(ns->queue)) ? Something like this:

	if (!blk_queue_is_zoned(ns->queue)) {
		if (is_power_of_2(iob))
			blk_queue_chunk_sectors(ns->queue, iob);
		else if (iob && !(disk->flags & GENHD_FL_UP))
			dev_warn(ctrl->device,
				"namespace:%u has unused io boundary:%u\n",
				ns->head->ns_id, iob);
	}

Thanks !

-- 
Damien Le Moal
Western Digital Research



More information about the Linux-nvme mailing list