[PATCH V4 2/3] md: propagate BLK_FEAT_PCI_P2PDMA from member devices to RAID device
Chaitanya Kulkarni
kch at nvidia.com
Wed May 13 11:51:52 PDT 2026
From: Kiran Kumar Modukuri <kmodukuri at nvidia.com>

MD RAID does not propagate BLK_FEAT_PCI_P2PDMA from member devices to
the RAID device, preventing peer-to-peer DMA through the RAID layer even
when all underlying devices support it.

Enable BLK_FEAT_PCI_P2PDMA unconditionally in raid0, raid1 and raid10
personalities during queue limits setup. blk_stack_limits() clears it
automatically if any member device lacks support, consistent with how
BLK_FEAT_NOWAIT and BLK_FEAT_POLL are handled in the block core.

Parity RAID personalities (raid4/5/6) are excluded because they require
CPU access to data pages for parity computation, which is incompatible
with P2P mappings.

Tested with RAID0/1/10 arrays containing multiple NVMe devices with
P2PDMA support, confirming that peer-to-peer transfers work correctly
through the RAID layer.

Tested-by: Pranjal Shrivastava <praan at google.com>
Reviewed-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagi at grimberg.me>
Reviewed-by: Xiao Ni <xni at redhat.com>
Signed-off-by: Kiran Kumar Modukuri <kmodukuri at nvidia.com>
Signed-off-by: Chaitanya Kulkarni <kch at nvidia.com>
---
drivers/md/raid0.c | 1 +
drivers/md/raid1.c | 1 +
drivers/md/raid10.c | 1 +
3 files changed, 3 insertions(+)
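
The set-then-clear behaviour described in the commit message can be
modelled in userspace roughly as follows. This is only a sketch: the
flag names, their values and stack_p2pdma() are hypothetical stand-ins,
not the kernel's definitions, and it assumes blk_stack_limits() drops
the flag exactly as the commit message states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BLK_FEAT_* bits; the values are illustrative. */
#define FEAT_OTHER      (1u << 0)	/* any unrelated feature bit */
#define FEAT_PCI_P2PDMA (1u << 1)

/*
 * Model of the stacking step: start from the features the personality
 * requested, then drop P2PDMA as soon as one member device lacks it.
 * Unrelated bits in the requested set are left untouched.
 */
static unsigned int stack_p2pdma(unsigned int requested,
				 const unsigned int *members, size_t n)
{
	unsigned int features = requested;
	size_t i;

	assert(members != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!(members[i] & FEAT_PCI_P2PDMA))
			features &= ~FEAT_PCI_P2PDMA;
	}
	return features;
}
```

In this model, calling stack_p2pdma() with FEAT_PCI_P2PDMA requested
keeps the bit only when every entry in members[] carries it, which is
why the personalities can set it unconditionally before stacking.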
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 5e38a51e349a..2cdaf7495d92 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -392,6 +392,7 @@ static int raid0_set_limits(struct mddev *mddev)
 	lim.io_opt = lim.io_min * mddev->raid_disks;
 	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 64d970e2ef50..cc628a1be52c 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3208,6 +3208,7 @@ static int raid1_set_limits(struct mddev *mddev)
 	lim.max_hw_wzeroes_unmap_sectors = 0;
 	lim.logical_block_size = mddev->logical_block_size;
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 39085e7dd6d2..f905dc391b74 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3941,6 +3941,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
 	lim.chunk_sectors = mddev->chunk_sectors;
 	lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
 	lim.features |= BLK_FEAT_ATOMIC_WRITES;
+	lim.features |= BLK_FEAT_PCI_P2PDMA;
 	err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
 	if (err)
 		return err;
--
2.39.5