[PATCH] nvme: zns: limit max_zone_append by max_segments

Shin'ichiro Kawasaki shinichiro.kawasaki at wdc.com
Mon Jul 31 04:46:32 PDT 2023


BIOs for zone append operation can not be split by nature, since device
returns the written sectors of the BIOs. Then the number of segments of
the BIOs shall not be larger than max_segments limit of devices to avoid
bio split due to the number of segments.

However, the nvme driver sets max_zone_append limit referring the limit
obtained from ZNS devices, regardless of max_segments limit of the
devices. This allows zone append BIOs to have large size, and also to
have number of segments larger than the max_segments limit. This causes
unexpected BIO split. It results in WARN in bio_split() followed by
KASAN issues. The recent commit 16d7fd3cfa72 ("zonefs: use iomap for
synchronous direct writes") triggered such BIO split. The commit
modified BIO preparation for zone append operations in zonefs, then
pages for the BIOs are set up not by bio_add_hw_page() but by
bio_iov_add_page(). This prepared each BIO segment to have one page, and
increased the number of segments. Hence the BIO split.

To avoid the unexpected BIO split, reflect the max_segments limit to the
max_zone_append limit. In the worst case, the safe max_zone_append size
is max_segments multiplied by PAGE_SIZE. Compare it with the
max_zone_append size obtained from ZNS devices, and set the smaller
value as the max_zone_append limit.

Fixes: 240e6ee272c0 ("nvme: support for zoned namespaces")
Cc: stable at vger.kernel.org
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki at wdc.com>
---
 drivers/nvme/host/zns.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index ec8557810c21..9ee77626c235 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -10,9 +10,11 @@
 int nvme_revalidate_zones(struct nvme_ns *ns)
 {
 	struct request_queue *q = ns->queue;
+	unsigned int max_sectors = queue_max_segments(q) << PAGE_SECTORS_SHIFT;
 
 	blk_queue_chunk_sectors(q, ns->zsze);
-	blk_queue_max_zone_append_sectors(q, ns->ctrl->max_zone_append);
+	max_sectors = min(max_sectors, ns->ctrl->max_zone_append);
+	blk_queue_max_zone_append_sectors(q, max_sectors);
 
 	return blk_revalidate_disk_zones(ns->disk, NULL);
 }
-- 
2.40.1




More information about the Linux-nvme mailing list