[PATCH V2 1/2] md: propagate BLK_FEAT_PCI_P2PDMA from member devices

Chaitanya Kulkarni kch at nvidia.com
Wed Apr 8 00:25:36 PDT 2026


From: Kiran Kumar Modukuri <kmodukuri at nvidia.com>

MD RAID does not propagate BLK_FEAT_PCI_P2PDMA from member devices to
the RAID device, preventing peer-to-peer DMA through the RAID layer even
when all underlying devices support it.

Enable BLK_FEAT_PCI_P2PDMA in the raid0, raid1, and raid10 personalities
during queue limits setup, and clear it in mddev_stack_rdev_limits()
during array initialization and in mddev_stack_new_rdev() during hot-add
if any member device lacks support. Parity RAID personalities
(raid4/5/6) are excluded because they need CPU access to data pages for
parity computation, which is incompatible with P2P mappings.

Tested with RAID0/1/10 arrays containing multiple NVMe devices with P2PDMA
support, confirming that peer-to-peer transfers work correctly through
the RAID layer.
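
For reference, the test setup described above can be sketched roughly as
follows. This is an illustrative outline only, not the exact test
procedure used; device names are placeholders for P2PDMA-capable NVMe
drives, and the nvmet configfs paths assume a target subsystem and
namespace have already been created:

```shell
# Create a two-member RAID0 array from P2PDMA-capable NVMe devices
# (substitute the device names for your own hardware).
mdadm --create /dev/md0 --level=0 --raid-devices=2 \
      /dev/nvme0n1 /dev/nvme1n1

# Confirm the array assembled.
cat /proc/mdstat

# One way to exercise peer-to-peer DMA through the array is to export
# it via the NVMe target with P2P memory enabled on the namespace,
# e.g. (paths abbreviated; <sub>/<ns> are placeholders):
#   echo 1 > /sys/kernel/config/nvmet/subsystems/<sub>/namespaces/<ns>/p2pmem
# and then drive I/O from a P2P-capable initiator.

# Tear down when done.
mdadm --stop /dev/md0
```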

Signed-off-by: Kiran Kumar Modukuri <kmodukuri at nvidia.com>
Signed-off-by: Chaitanya Kulkarni <kch at nvidia.com>
---
 drivers/md/md.c     | 4 ++++
 drivers/md/raid0.c  | 1 +
 drivers/md/raid1.c  | 1 +
 drivers/md/raid10.c | 1 +
 4 files changed, 7 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 521d9b34cd9e..48d7a3ca8c66 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6176,6 +6176,8 @@ int mddev_stack_rdev_limits(struct mddev *mddev, struct queue_limits *lim,
 		if ((flags & MDDEV_STACK_INTEGRITY) &&
 		    !queue_limits_stack_integrity_bdev(lim, rdev->bdev))
 			return -EINVAL;
+		if (!blk_queue_pci_p2pdma(rdev->bdev->bd_disk->queue))
+			lim->features &= ~BLK_FEAT_PCI_P2PDMA;
 	}
 
 	/*
@@ -6231,6 +6233,8 @@ int mddev_stack_new_rdev(struct mddev *mddev, struct md_rdev *rdev)
 	lim = queue_limits_start_update(mddev->gendisk->queue);
 	queue_limits_stack_bdev(&lim, rdev->bdev, rdev->data_offset,
 				mddev->gendisk->disk_name);
+	if (!blk_queue_pci_p2pdma(rdev->bdev->bd_disk->queue))
+		lim.features &= ~BLK_FEAT_PCI_P2PDMA;
 
 	if (!queue_limits_stack_integrity_bdev(&lim, rdev->bdev)) {
 		pr_err("%s: incompatible integrity profile for %pg\n",
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index ef0045db409f..1cdcafd31744 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -392,6 +392,7 @@ static int raid0_set_limits(struct mddev *mddev)
 	lim.io_opt = lim.io_min * mddev->raid_disks;
 	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 16f671ab12c0..b25e661e9738 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3192,6 +3192,7 @@ static int raid1_set_limits(struct mddev *mddev)
 	lim.max_hw_wzeroes_unmap_sectors = 0;
 	lim.logical_block_size = mddev->logical_block_size;
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 4901ebe45c87..07a5b734c8f3 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3939,6 +3939,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
 	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
-- 
2.39.5

More information about the Linux-nvme mailing list