[PATCH V3 2/3] md: propagate BLK_FEAT_PCI_P2PDMA from member devices to RAID device
Xiao Ni
xni at redhat.com
Tue Apr 21 02:18:48 PDT 2026
On Fri, Apr 17, 2026 at 5:27 AM Chaitanya Kulkarni <kch at nvidia.com> wrote:
>
> From: Kiran Kumar Modukuri <kmodukuri at nvidia.com>
>
> MD RAID does not propagate BLK_FEAT_PCI_P2PDMA from member devices to
> the RAID device, preventing peer-to-peer DMA through the RAID layer even
> when all underlying devices support it.
>
> Enable BLK_FEAT_PCI_P2PDMA unconditionally in raid0, raid1 and raid10
> personalities during queue limits setup. blk_stack_limits() clears it
> automatically if any member device lacks support, consistent with how
> BLK_FEAT_NOWAIT and BLK_FEAT_POLL are handled in the block core.
>
> Parity RAID personalities (raid4/5/6) are excluded because they require
> CPU access to data pages for parity computation, which is incompatible
> with P2P mappings.
>
> Tested with RAID0/1/10 arrays containing multiple NVMe devices with
> P2PDMA support, confirming that peer-to-peer transfers work correctly
> through the RAID layer.
>
> Signed-off-by: Kiran Kumar Modukuri <kmodukuri at nvidia.com>
> Signed-off-by: Chaitanya Kulkarni <kch at nvidia.com>
> ---
> drivers/md/raid0.c | 1 +
> drivers/md/raid1.c | 1 +
> drivers/md/raid10.c | 1 +
> 3 files changed, 3 insertions(+)
>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index 5e38a51e349a..2cdaf7495d92 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -392,6 +392,7 @@ static int raid0_set_limits(struct mddev *mddev)
> lim.io_opt = lim.io_min * mddev->raid_disks;
> lim.chunk_sectors = mddev->chunk_sectors;
> lim.features |= BLK_FEAT_ATOMIC_WRITES;
> + lim.features |= BLK_FEAT_PCI_P2PDMA;
> err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
> if (err)
> return err;
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index ba91f7e61920..422ad4786569 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -3215,6 +3215,7 @@ static int raid1_set_limits(struct mddev *mddev)
> lim.max_hw_wzeroes_unmap_sectors = 0;
> lim.logical_block_size = mddev->logical_block_size;
> lim.features |= BLK_FEAT_ATOMIC_WRITES;
> + lim.features |= BLK_FEAT_PCI_P2PDMA;
> err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
> if (err)
> return err;
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 4901ebe45c87..07a5b734c8f3 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -3939,6 +3939,7 @@ static int raid10_set_queue_limits(struct mddev *mddev)
> lim.chunk_sectors = mddev->chunk_sectors;
> lim.io_opt = lim.io_min * raid10_nr_stripes(conf);
> lim.features |= BLK_FEAT_ATOMIC_WRITES;
> + lim.features |= BLK_FEAT_PCI_P2PDMA;
> err = mddev_stack_rdev_limits(mddev, &lim, MDDEV_STACK_INTEGRITY);
> if (err)
> return err;
> --
> 2.39.5
>
>
Thanks for enabling this feature. Looks good to me.
Reviewed-by: Xiao Ni <xni at redhat.com>