[RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Tue Apr 18 14:03:39 PDT 2017


On Tue, Apr 18, 2017 at 12:48:35PM -0700, Dan Williams wrote:

> > Yes, I noticed this problem too and that makes sense. It just means
> > every dma_ops will probably need to be modified to either support p2p
> > pages or fail on them. Though, the only real difficulty there is that it
> > will be a lot of work.
> 
> I don't think you need to go touch all dma_ops, I think you can just
> arrange for devices that are going to do dma to get redirected to a
> p2p aware provider of operations that overrides the system default
> dma_ops. I.e. just touch get_dma_ops().

I don't follow: when would get_dma_ops() return a p2p-aware provider?
It has no way to know whether the DMA is going to involve p2p;
get_dma_ops() is called with the device initiating the DMA.

So you'd always return the P2P shim on a system that has registered
P2P memory?

Even so, how does this shim work? dma_ops are not really intended to
be stacked. How would we make unmap work, for instance? What happens
when the underlying iommu dma ops actually natively understands p2p
and doesn't want the shim?

I think this opens an even bigger can of worms..

Let's find a strategy to safely push this into dma_ops.

What about something more incremental like this instead:
- dma_ops will set map_sg_p2p == map_sg when they are updated to
  support p2p; until then, DMA on p2p pages will fail for those ops.
- Once all ops support p2p, we remove the if and ops->map_sg and
  just call map_sg_p2p.
- For now, the scatterlist maintains a bit, set as pages are added,
  indicating whether p2p memory might be present in the list.
- Unmap for p2p and non-p2p is the same; the underlying ops driver
  has to make it work.

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 0977317c6835c2..505ed7d502053d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -103,6 +103,9 @@ struct dma_map_ops {
 	int (*map_sg)(struct device *dev, struct scatterlist *sg,
 		      int nents, enum dma_data_direction dir,
 		      unsigned long attrs);
+	int (*map_sg_p2p)(struct device *dev, struct scatterlist *sg,
+			  int nents, enum dma_data_direction dir,
+			  unsigned long attrs);
 	void (*unmap_sg)(struct device *dev,
 			 struct scatterlist *sg, int nents,
 			 enum dma_data_direction dir,
@@ -244,7 +247,15 @@ static inline int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
 	for_each_sg(sg, s, nents, i)
 		kmemcheck_mark_initialized(sg_virt(s), s->length);
 	BUG_ON(!valid_dma_direction(dir));
-	ents = ops->map_sg(dev, sg, nents, dir, attrs);
+
+	if (sg_has_p2p(sg)) {
+		if (ops->map_sg_p2p)
+			ents = ops->map_sg_p2p(dev, sg, nents, dir, attrs);
+		else
+			return 0;
+	} else
+		ents = ops->map_sg(dev, sg, nents, dir, attrs);
+
 	BUG_ON(ents < 0);
 	debug_dma_map_sg(dev, sg, nents, ents, dir);
 
