[PATCH v9 17/24] vfio/mlx5: Enable the DMA link API
Jason Gunthorpe
jgg at ziepe.ca
Wed Apr 23 11:09:41 PDT 2025
On Wed, Apr 23, 2025 at 11:13:08AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro at nvidia.com>
>
> Remove intermediate scatter-gather table completely and
> enable new DMA link API.
>
> Tested-by: Jens Axboe <axboe at kernel.dk>
> Signed-off-by: Leon Romanovsky <leonro at nvidia.com>
> ---
> drivers/vfio/pci/mlx5/cmd.c | 298 ++++++++++++++++-------------------
> drivers/vfio/pci/mlx5/cmd.h | 21 ++-
> drivers/vfio/pci/mlx5/main.c | 31 ----
> 3 files changed, 147 insertions(+), 203 deletions(-)
Reviewed-by: Jason Gunthorpe <jgg at nvidia.com>
> +static int register_dma_pages(struct mlx5_core_dev *mdev, u32 npages,
> + struct page **page_list, u32 *mkey_in,
> + struct dma_iova_state *state,
> + enum dma_data_direction dir)
> +{
> + dma_addr_t addr;
> + size_t mapped = 0;
> + __be64 *mtt;
> + int i, err;
>
> - return mlx5_core_create_mkey(mdev, mkey, mkey_in, inlen);
> + WARN_ON_ONCE(dir == DMA_NONE);
> +
> + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt);
> +
> + if (dma_iova_try_alloc(mdev->device, state, 0, npages * PAGE_SIZE)) {
> + addr = state->addr;
> + for (i = 0; i < npages; i++) {
> + err = dma_iova_link(mdev->device, state,
> + page_to_phys(page_list[i]), mapped,
> + PAGE_SIZE, dir, 0);
> + if (err)
> + goto error;
> + *mtt++ = cpu_to_be64(addr);
> + addr += PAGE_SIZE;
> + mapped += PAGE_SIZE;
> + }
This is an area I'd like to see improved in a follow-up.
Since we know we are allocating a contiguous IOVA range, we should be
able to request a particular alignment, so that the whole range can go
into the mkey as a single MTT entry. That would eliminate the double
translation cost in the HW.
The RDMA mkey builder is able to do this from the scatterlist, but that
logic was too complex to copy into vfio. This version is close to being
simple enough; the alignment is the only remaining problem.
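Roughly what I have in mind, as an untested sketch only:
dma_iova_try_alloc() has no alignment parameter today, so the
dma_iova_try_alloc_aligned() call below is a made-up placeholder for
whatever the DMA API would have to grow, and the single-MTT mkey setup
is simplified:

static int register_dma_pages_aligned(struct mlx5_core_dev *mdev, u32 npages,
				      struct page **page_list, u32 *mkey_in,
				      struct dma_iova_state *state,
				      enum dma_data_direction dir)
{
	size_t len = (size_t)npages * PAGE_SIZE;
	size_t mapped = 0;
	__be64 *mtt;
	void *mkc;
	u32 i;
	int err;

	mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt);
	mkc = MLX5_ADDR_OF(create_mkey_in, mkey_in, memory_key_mkey_entry);

	/* Hypothetical: also guarantee the IOVA is aligned to len */
	if (!dma_iova_try_alloc_aligned(mdev->device, state, 0, len, len))
		return -EOPNOTSUPP;	/* caller falls back to per-page MTTs */

	for (i = 0; i < npages; i++) {
		err = dma_iova_link(mdev->device, state,
				    page_to_phys(page_list[i]), mapped,
				    PAGE_SIZE, dir, 0);
		if (err)
			goto err_unmap;
		mapped += PAGE_SIZE;
	}

	err = dma_iova_sync(mdev->device, state, 0, mapped);
	if (err)
		goto err_unmap;

	/*
	 * Contiguous, len-aligned IOVA: describe the whole range with one
	 * big page and a single MTT entry instead of npages 4k entries.
	 * (log_page_size is only 5 bits, so a real version has to cap the
	 * order and fall back for huge buffers.)
	 */
	MLX5_SET(mkc, mkc, log_page_size, order_base_2(len));
	*mtt = cpu_to_be64(state->addr);
	return 0;

err_unmap:
	if (mapped)
		dma_iova_destroy(mdev->device, state, mapped, dir, 0);
	else
		dma_iova_free(mdev->device, state);
	return err;
}

The per-page dma_iova_link() calls stay the same; the win is purely in
the mkey layout, since one len-aligned IOVA range needs only one MTT
entry instead of npages of them.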
Jason