[PATCH v7 00/17] Provide a new two step DMA mapping API

Dan Williams dan.j.williams at intel.com
Thu Apr 17 18:20:35 PDT 2025


Jason Gunthorpe wrote:
> On Fri, Mar 21, 2025 at 12:52:30AM +0100, Marek Szyprowski wrote:
> > > Christoph's vision was to make a performance DMA API path that could
> > > be used to implement any scatterlist-like data structure very
> > > efficiently without having to teach the DMA API about all sorts of
> > > scatterlist-like things.
> > 
> > Thanks for explaining one more motivation behind this patchset!
> 
> Sure, no problem.
> 
> To close the loop on the bigger picture here..
> 
> When you put the parts together:
> 
>  1) dma_map_sg is the only API that is both performant and fully
>     functional
> 
>  2) scatterlist is a horrible leaky design and badly misused all over
>     the place. When Logan added SG_DMA_BUS_ADDRESS it became quite
>     clear that any significant changes to scatterlist are infeasible,
>     or at least we'd break a huge number of untestable legacy drivers
>     in the process.
> 
>  3) We really want to do full featured performance DMA *without* a
>     struct page. This requires changing scatterlist, inventing a new
>     scatterlist v2 and DMA map for it, or this idea here of a flexible
>     lower level DMA API entry point.
> 
>     Matthew has been talking about struct-pageless for a long time now
>     from the block/mm direction using folio & memdesc and this is
>     meeting his work from the other end of the stack by starting to
>     build a way to do DMA on future struct pageless things. This is
>     going to be a huge multi-year project but small parts like this need
>     to be solved and agreed on to make progress.
> 
>  4) In the immediate moment we still have problems in VFIO, RDMA, and
>     DRM managing P2P transfers because dma_map_resource/page() don't
>     properly work, and we don't have struct pages to use
>     dma_map_sg(). Hacks around the DMA API have been in the kernel for
>     a long time now, we want to see a properly architected solution.

So I am late to this party, but after watching a "modest" proposal of a
DMABUF pfn exporter bounce off the DRM community due to long-standing
pain points with scatterlist abuse [1], it is clear to me that a new DMA
mapping API is in the critical path for PCI Device Security
(Confidential Computing: TEE I/O).

Specifically, the confidential computing problem of how to coordinate
the conversion of assigned devices from shared-world to private-world
(including private device MMIO and DMA) needs a "non-scatterlist",
"struct-page-less" mapping contract to describe those resources.

I concede the point that there are gaps between this proposal
and the end state needed for PCI Device Security. However, it seems to
be a case of "violent agreement" that some of the benefits of this
proposal only arrive with future work. So this is a necessary first
step.

For my part, I plan to pull this series into a cross-vendor staging tree
for device-security topics [2] so that the PCI Device Security community
can get started on everything that needs to build on top of this.

[1]: http://lore.kernel.org/20250107142719.179636-1-yilun.xu@linux.intel.com
[2]: https://web.git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm.git/


