[PATCH 07/13] block: track zone conditions
Damien Le Moal
dlemoal at kernel.org
Mon Nov 3 14:34:30 PST 2025
On 11/4/25 03:31, Bart Van Assche wrote:
> On 11/3/25 7:48 AM, Bart Van Assche wrote:
>> On 11/2/25 10:05 PM, Damien Le Moal wrote:
>>> On 11/1/25 06:17, Bart Van Assche wrote:
>>>> On 10/30/25 11:13 PM, Damien Le Moal wrote:
>>>>> Implement tracking of the runtime changes to zone conditions using
>>>>> the new cond field in struct blk_zone_wplug. The size of this structure
>>>>> remains 112 Bytes as the new field replaces the 4 Bytes padding at the
>>>>> end of the structure. For zones that do not have a zone write plug, the
>>>>> zones_cond array of a disk is used to track changes to zone conditions,
>>>>> e.g. when a zone reset, reset all or finish operation is executed.
>>>>
>>>> Why is it necessary to track the condition of sequential zones that do
>>>> not have a zone write plug? Please explain what the use cases are.
>>>
>>> Because zones that do not have a zone write plug can be empty OR full.
>>
>> Why does the block layer have to track this information? Filesystems can
>> easily derive this information from their own metadata, can't they?
>
> (replying to my own email)
>
> Is this a good way to check what zone type information filesystems need?
>
> $ git grep -nH BLK_ZONE_TYPE_ fs
> fs/btrfs/zoned.c:96: ASSERT(zones[i].type != BLK_ZONE_TYPE_CONVENTIONAL);
> fs/btrfs/zoned.c:211: zones[i].type = BLK_ZONE_TYPE_CONVENTIONAL;
> fs/btrfs/zoned.c:488: if (zones[i].type == BLK_ZONE_TYPE_SEQWRITE_REQ)
> fs/btrfs/zoned.c:566: BLK_ZONE_TYPE_CONVENTIONAL)
> fs/btrfs/zoned.c:815: if (zones[0].type == BLK_ZONE_TYPE_CONVENTIONAL) {
> fs/btrfs/zoned.c:1360: if (unlikely(zone.type == BLK_ZONE_TYPE_CONVENTIONAL)) {
> fs/f2fs/segment.c:5295: if (zone->type != BLK_ZONE_TYPE_SEQWRITE_REQ)
> fs/f2fs/segment.c:5417: if (zone.type != BLK_ZONE_TYPE_SEQWRITE_REQ)
> fs/f2fs/segment.c:5473: if (zone.type != BLK_ZONE_TYPE_SEQWRITE_REQ)
> fs/f2fs/super.c:4332: if (zone->type == BLK_ZONE_TYPE_CONVENTIONAL)
> fs/xfs/libxfs/xfs_zones.c:177: case BLK_ZONE_TYPE_CONVENTIONAL:
> fs/xfs/libxfs/xfs_zones.c:179: case BLK_ZONE_TYPE_SEQWRITE_REQ:
> fs/zonefs/super.c:385: zone.type = BLK_ZONE_TYPE_CONVENTIONAL;
> fs/zonefs/super.c:874: case BLK_ZONE_TYPE_CONVENTIONAL:
> fs/zonefs/super.c:886: case BLK_ZONE_TYPE_SEQWRITE_REQ:
> fs/zonefs/super.c:887: case BLK_ZONE_TYPE_SEQWRITE_PREF:
> fs/zonefs/zonefs.h:26: * defined in linux/blkzoned.h, that is, BLK_ZONE_TYPE_SEQWRITE_REQ and
> fs/zonefs/zonefs.h:27: * BLK_ZONE_TYPE_SEQWRITE_PREF.
> fs/zonefs/zonefs.h:37: if (zone->type == BLK_ZONE_TYPE_CONVENTIONAL)
>
> In the above I see that all filesystems check for the following zone
> types and don't check whether a zone is empty or full:
> * CONVENTIONAL
> * SEQWRITE_REQ
> * SEQWRITE_PREF
>
> Do you agree with this conclusion?

Absolutely not.

$ git grep -nH BLK_ZONE_COND_ fs
fs/btrfs/zoned.c:75: return (zone->cond == BLK_ZONE_COND_FULL) ||
fs/btrfs/zoned.c:97: empty[i] = (zones[i].cond == BLK_ZONE_COND_EMPTY);
fs/btrfs/zoned.c:212: zones[i].cond = BLK_ZONE_COND_NOT_WP;
fs/btrfs/zoned.c:491: case BLK_ZONE_COND_EMPTY:
fs/btrfs/zoned.c:494: case BLK_ZONE_COND_IMP_OPEN:
fs/btrfs/zoned.c:495: case BLK_ZONE_COND_EXP_OPEN:
fs/btrfs/zoned.c:496: case BLK_ZONE_COND_CLOSED:
fs/btrfs/zoned.c:497: case BLK_ZONE_COND_ACTIVE:
fs/btrfs/zoned.c:833: if (reset && reset->cond != BLK_ZONE_COND_EMPTY) {
fs/btrfs/zoned.c:845: reset->cond = BLK_ZONE_COND_EMPTY;
fs/btrfs/zoned.c:967: if (zone->cond == BLK_ZONE_COND_FULL) {
fs/btrfs/zoned.c:972: if (zone->cond == BLK_ZONE_COND_EMPTY)
fs/btrfs/zoned.c:973: zone->cond = BLK_ZONE_COND_IMP_OPEN;
fs/btrfs/zoned.c:1000: zone->cond = BLK_ZONE_COND_FULL;
fs/btrfs/zoned.c:1373: case BLK_ZONE_COND_OFFLINE:
fs/btrfs/zoned.c:1374: case BLK_ZONE_COND_READONLY:
fs/btrfs/zoned.c:1381: case BLK_ZONE_COND_EMPTY:
fs/btrfs/zoned.c:1384: case BLK_ZONE_COND_FULL:
fs/f2fs/segment.c:5319: if ((!valid_block_cnt && zone->cond == BLK_ZONE_COND_EMPTY) ||
fs/f2fs/segment.c:5320: (valid_block_cnt && zone->cond == BLK_ZONE_COND_FULL))
fs/xfs/libxfs/xfs_zones.c:93: case BLK_ZONE_COND_EMPTY:
fs/xfs/libxfs/xfs_zones.c:95: case BLK_ZONE_COND_IMP_OPEN:
fs/xfs/libxfs/xfs_zones.c:96: case BLK_ZONE_COND_EXP_OPEN:
fs/xfs/libxfs/xfs_zones.c:97: case BLK_ZONE_COND_CLOSED:
fs/xfs/libxfs/xfs_zones.c:99: case BLK_ZONE_COND_FULL:
fs/xfs/libxfs/xfs_zones.c:101: case BLK_ZONE_COND_NOT_WP:
fs/xfs/libxfs/xfs_zones.c:102: case BLK_ZONE_COND_OFFLINE:
fs/xfs/libxfs/xfs_zones.c:103: case BLK_ZONE_COND_READONLY:
fs/xfs/libxfs/xfs_zones.c:122: case BLK_ZONE_COND_NOT_WP:
fs/xfs/xfs_zone_alloc.c:985: if (!zone || zone->cond == BLK_ZONE_COND_NOT_WP) {
fs/zonefs/super.c:195: case BLK_ZONE_COND_OFFLINE:
fs/zonefs/super.c:200: case BLK_ZONE_COND_READONLY:
fs/zonefs/super.c:215: case BLK_ZONE_COND_FULL:
fs/zonefs/super.c:386: zone.cond = BLK_ZONE_COND_NOT_WP;
fs/zonefs/super.c:986: if (next->cond == BLK_ZONE_COND_READONLY &&
fs/zonefs/super.c:987: zone->cond != BLK_ZONE_COND_OFFLINE)
fs/zonefs/super.c:988: zone->cond = BLK_ZONE_COND_READONLY;
fs/zonefs/super.c:989: else if (next->cond == BLK_ZONE_COND_OFFLINE)
fs/zonefs/super.c:990: zone->cond = BLK_ZONE_COND_OFFLINE;
fs/zonefs/super.c:1034: (zone->cond == BLK_ZONE_COND_IMP_OPEN ||
fs/zonefs/super.c:1035: zone->cond == BLK_ZONE_COND_EXP_OPEN)) {
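
To illustrate the kind of check those hits correspond to, here is a minimal
userspace sketch of a scan that sorts zones by condition. It is not taken from
any of the filesystems listed above: every scan_* and SCAN_COND_* name is
hypothetical, and the enum is a local stand-in for the BLK_ZONE_COND_* values
rather than the definitions in linux/blkzoned.h.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for BLK_ZONE_COND_*; names and values are assumptions. */
enum scan_cond {
	SCAN_COND_NOT_WP,	/* conventional zone, no write pointer */
	SCAN_COND_EMPTY,
	SCAN_COND_IMP_OPEN,
	SCAN_COND_EXP_OPEN,
	SCAN_COND_CLOSED,
	SCAN_COND_READONLY,
	SCAN_COND_FULL,
	SCAN_COND_OFFLINE,
};

/* Hypothetical, stripped-down zone descriptor: only the condition is kept. */
struct scan_zone {
	enum scan_cond cond;
};

struct scan_summary {
	size_t empty;		/* can be allocated without a reset */
	size_t active;		/* partially written: open or closed */
	size_t full;		/* must be reset before reuse */
	size_t unusable;	/* read-only or offline */
};

/* Walk a zone report and count zones per condition class. */
static struct scan_summary scan_zones(const struct scan_zone *zones, size_t nr)
{
	struct scan_summary s = { 0 };

	assert(zones != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		switch (zones[i].cond) {
		case SCAN_COND_EMPTY:
			s.empty++;
			break;
		case SCAN_COND_IMP_OPEN:
		case SCAN_COND_EXP_OPEN:
		case SCAN_COND_CLOSED:
			s.active++;
			break;
		case SCAN_COND_FULL:
			s.full++;
			break;
		case SCAN_COND_READONLY:
		case SCAN_COND_OFFLINE:
			s.unusable++;
			break;
		case SCAN_COND_NOT_WP:
			/* Conventional zones are not classified here. */
			break;
		}
	}
	return s;
}
```

The zone type alone cannot separate these classes: an empty and a full
sequential zone have the same type and differ only in their condition.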

And if you are still not convinced, read the mount code for XFS and btrfs.
You'll see the point of having a fast, cached report zones to speed that up.

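
For reference, the split described in the commit message (the cond field of
the zone write plug when a plug exists, the zones_cond array of the disk
otherwise) can be pictured with the userspace sketch below. This is only an
illustration under simplifying assumptions, not the code of the patch: all
sketch_* names are hypothetical, the structures model nothing but the
condition, and locking and reference counting are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for BLK_ZONE_COND_*; names and values are assumptions. */
enum sketch_zone_cond {
	SKETCH_COND_NOT_WP,
	SKETCH_COND_EMPTY,
	SKETCH_COND_IMP_OPEN,
	SKETCH_COND_FULL,
};

/* Hypothetical stand-in for struct blk_zone_wplug: only cond is modeled. */
struct sketch_wplug {
	uint8_t cond;
};

/* Hypothetical stand-in for the per-disk zone state. */
struct sketch_disk {
	unsigned int nr_zones;
	struct sketch_wplug **wplugs;	/* NULL entry: zone has no write plug */
	uint8_t *zones_cond;		/* condition of zones without a plug */
};

/*
 * Return the cached condition of zone zno, or -1 if zno is out of range.
 * A zone with a write plug is answered from the plug; any other zone is
 * answered from the per-disk array, so no device command is needed.
 */
static int sketch_get_cond(const struct sketch_disk *disk, unsigned int zno)
{
	if (zno >= disk->nr_zones)
		return -1;
	if (disk->wplugs[zno])
		return disk->wplugs[zno]->cond;
	return disk->zones_cond[zno];
}

/*
 * Model a zone reset: the zone no longer has a write plug, and the array
 * records that it is now empty.
 */
static void sketch_zone_reset(struct sketch_disk *disk, unsigned int zno)
{
	assert(zno < disk->nr_zones);
	disk->wplugs[zno] = NULL;
	disk->zones_cond[zno] = SKETCH_COND_EMPTY;
}
```

With both sources in place, a lookup for any zone is a memory read, which is
what makes a cached report zones cheap compared to asking the device.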
--
Damien Le Moal
Western Digital Research