[PATCH v3 00/15] Introduce cached report zones

Damien Le Moal dlemoal at kernel.org
Tue Nov 4 12:59:51 PST 2025


On 11/4/25 23:41, Christoph Hellwig wrote:
> I just threw this into my xfstests setup, and it seems this version
> is broken somehow.  Running on emulated ZNS devices with XFS I get
> a lot of failures with warnings like this:
> 
> [   30.068652] XFS (nvme1n1): empty zone 1 has non-zero used counter (0x1).
> 
> [   49.316873] XFS (nvme0n1): empty zone 2 has non-zero used counter (0x10).
> 
> so it seems like it's not tracking WPs correctly, probably when using
> zone append and unmount/remounting.

Ouch! I totally missed that!

The first problem is trivial. I am missing this hunk in the XFS patch:

diff --git a/fs/xfs/libxfs/xfs_zones.c b/fs/xfs/libxfs/xfs_zones.c
index b0791a71931c..b40f71f878b5 100644
--- a/fs/xfs/libxfs/xfs_zones.c
+++ b/fs/xfs/libxfs/xfs_zones.c
@@ -95,6 +95,7 @@ xfs_zone_validate_seq(
        case BLK_ZONE_COND_IMP_OPEN:
        case BLK_ZONE_COND_EXP_OPEN:
        case BLK_ZONE_COND_CLOSED:
+       case BLK_ZONE_COND_ACTIVE:
                return xfs_zone_validate_wp(zone, rtg, write_pointer);
        case BLK_ZONE_COND_FULL:
                return xfs_zone_validate_full(zone, rtg, write_pointer);

The second problem is a little more subtle: when disk_insert_zone_wplug() is
called from the blk_revalidate_disk_zones() callback for a non-empty zone, the
zone condition is set to BLK_ZONE_COND_NOT_WP if the disk does not have a
zones_cond array. But that array never exists on the first call to
blk_revalidate_disk_zones(), since that function only sets it at the end of the
revalidation. So we also need this change, which avoids mount errors after a
reboot for an FS with partially written zones:

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index bf6495f0d49f..bba64b427082 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -527,12 +527,20 @@ static bool disk_insert_zone_wplug(struct gendisk *disk,
                        return false;
                }
        }
+
+       /*
+        * Set the zone condition: if we do not yet have a zones_cond array
+        * attached to the disk, then this is a zone write plug insert from the
+        * first call to blk_revalidate_disk_zones(), in which case the zone is
+        * necessarily in the active condition.
+        */
        zones_cond = rcu_dereference_check(disk->zones_cond,
                                lockdep_is_held(&disk->zone_wplugs_lock));
        if (zones_cond)
                zwplug->cond = zones_cond[zwplug->zone_no];
        else
-               zwplug->cond = BLK_ZONE_COND_NOT_WP;
+               zwplug->cond = BLK_ZONE_COND_ACTIVE;
+
        hlist_add_head_rcu(&zwplug->node, &disk->zone_wplugs_hash[idx]);
        atomic_inc(&disk->nr_zone_wplugs);
        spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);

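To make the fallback above easier to follow outside of the kernel, here is a
minimal userspace sketch of the selection logic only. The enum, its values and
the helper name wplug_initial_cond() are hypothetical stand-ins for
illustration; they are not the kernel definitions, and locking/RCU is left out
entirely.

```c
#include <stddef.h>

/*
 * Simplified stand-ins for the zone conditions discussed above.
 * The names and values are illustrative assumptions, not the kernel's.
 */
enum zone_cond {
	ZCOND_NOT_WP,
	ZCOND_EMPTY,
	ZCOND_CLOSED,
	ZCOND_ACTIVE,
	ZCOND_FULL,
};

/*
 * Hypothetical helper modelling the condition given to a zone write plug
 * at insertion time. zones_cond is NULL until the first revalidation has
 * completed; a plug inserted during that first pass is for a non-empty
 * zone, so it is treated as active instead of "not write pointer".
 */
static enum zone_cond wplug_initial_cond(const enum zone_cond *zones_cond,
					 unsigned int zone_no)
{
	if (zones_cond)
		return zones_cond[zone_no];
	return ZCOND_ACTIVE;
}
```

With no cached array, wplug_initial_cond(NULL, 2) returns ZCOND_ACTIVE; once
an array exists, the cached per-zone value is used as is.
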
I will integrate these fixes in v5 and resend.




-- 
Damien Le Moal
Western Digital Research


