[PATCH v1 00/17] Provide a new two step DMA mapping API
Jason Gunthorpe
jgg at ziepe.ca
Thu Nov 7 05:28:08 PST 2024
On Thu, Nov 07, 2024 at 09:32:56AM +0100, Christoph Hellwig wrote:
> On Tue, Nov 05, 2024 at 03:53:57PM -0400, Jason Gunthorpe wrote:
> > > Yeah, I don't really get the struct page argument. In fact if we look
> > > at the nitty-gritty details of dma_map_page it doesn't really need a
> > > page at all.
> >
> > Today, if you want to map a P2P address you must have a struct page,
> > because page->pgmap is the only source of information on the P2P
> > topology.
> >
> > So the logic is: to get P2P without struct page, we need a way to
> > have all the features of dma_map_sg() but without a mandatory
> > scatterlist, because we cannot remove struct page from scatterlist.
>
> Well, that is true but also not the point. The hard part is to
> find the P2P routing information without the page. After that
> any physical address based interface will work, including a trivial
> dma_map_phys.
Once we are freed from scatterlist we can explore a design that would
pass the P2P routing information directly. For instance, imagine
something like:

	dma_map_p2p(dev, phys, p2p_provider);
Then dma_map_page(dev, page) could be something like:

	if (is_pci_p2pdma_page(page))
		dma_map_p2p(dev, page_to_phys(page),
			    page->pgmap->p2p_provider);
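Fleshed out a little, the wrapper might look roughly like this. This
is only a sketch: dma_map_p2p(), the pgmap->p2p_provider member, and
the size/direction arguments are all hypothetical, and dma_map_phys()
stands in for the trivial CPU-memory path mentioned above:

	/* Sketch only - dma_map_p2p()/dma_map_phys() are hypothetical */
	dma_addr_t dma_map_page(struct device *dev, struct page *page,
				unsigned long offset, size_t size,
				enum dma_data_direction dir)
	{
		phys_addr_t phys = page_to_phys(page) + offset;

		if (is_pci_p2pdma_page(page))
			return dma_map_p2p(dev, phys, size, dir,
					   page->pgmap->p2p_provider);
		/* Ordinary CPU memory takes the existing path */
		return dma_map_phys(dev, phys, size, dir);
	}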
From there we could then go into DRM/VFIO/etc and give them
p2p_providers without pgmaps. The p2p_provider would be a light
refactoring of what is already in drivers/pci/p2pdma.c.
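As a strawman, the provider could be little more than the routing
state that drivers/pci/p2pdma.c already keeps per pgmap today, split
out so it can exist without one (the field names here are guesses):

	/* Hypothetical: P2P routing state, detached from struct page */
	struct p2p_provider {
		struct pci_dev *owner;	/* device exporting the memory */
		u64 bus_offset;		/* CPU phys addr -> PCI bus addr */
	};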
For the dmabuf use cases it is not actually hard to find the P2P
routing information - the driver constructing the dmabuf has it. The
challenge is carrying that information from the originating driver,
through the dmabuf APIs, to the final place that does the DMA mapping.
So I'm thinking of a data structure for things like dmabuf/rdma MR
that is sort of like this:

	struct phys_list {
		enum phys_type type;	/* CPU, P2P, encrypted, whatever */
		struct p2p_provider *p2p_provider;
		struct phys_list *next;
		struct phys_range frags[];
	};
Where each phys_list would be a single uniform DMA operation and would
easily carry the extra metadata. No struct page, no serious issue
transferring the P2P routing information.
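To make that concrete, a minimal sketch of a consumer, assuming a
nr_frags count alongside frags[], an assumed PHYS_TYPE_P2P enum value,
and the hypothetical dma_map_p2p()/dma_map_phys() primitives from
above:

	/* Sketch: each phys_list maps with one operation chosen by type */
	for (struct phys_list *pl = head; pl; pl = pl->next) {
		for (unsigned int i = 0; i < pl->nr_frags; i++) {
			struct phys_range *f = &pl->frags[i];
			dma_addr_t dma;

			if (pl->type == PHYS_TYPE_P2P)
				dma = dma_map_p2p(dev, f->addr, f->len,
						  dir, pl->p2p_provider);
			else
				dma = dma_map_phys(dev, f->addr, f->len,
						   dir);
			/* ... store dma for the device to use ... */
		}
	}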
> > I saw the Intel XE team make a complicated integration with the DMA
> > API that wasn't so good. They were looking at an earlier version of
> > this and I think the feedback was positive. It should make a big
> > difference, but we will need to see what they come up with and
> > possibly tweak things.
>
> Not even sure what XE is, but do you have a pointer to it? It would
> really be great if people having DMA problems talked to the dma-mapping
> and iommu maintainers / list.
Xe is Intel's GPU driver:
https://lore.kernel.org/dri-devel/20240117221223.18540-7-oak.zeng@intel.com/
Jason