[RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

Knut Omang knut.omang at oracle.com
Mon Apr 24 00:36:37 PDT 2017


On Mon, 2017-04-17 at 08:31 +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2017-04-16 at 10:34 -0600, Logan Gunthorpe wrote:
> > On 16/04/17 09:53 AM, Dan Williams wrote:
> > > ZONE_DEVICE allows you to redirect via get_dev_pagemap() to retrieve
> > > context about the physical address in question. I'm thinking you can
> > > hang bus address translation data off of that structure. This seems
> > > vaguely similar to what HMM is doing.
> > Thanks! I didn't realize you had the infrastructure to look up a device
> > from a pfn/page. That would really come in handy for us.
> 
> It does indeed. I won't be able to play with that much for a few weeks
> (see my other email) so if you're going to tackle this while I'm away,
> can you work with Jerome to make sure you don't conflict with HMM ?
> 
> I really want a way for HMM to be able to layout struct pages over the
> GPU BARs rather than in "allocated free space" for the case where the
> BAR is big enough to cover all of the GPU memory.
> 
> In general, I'd like a simple & generic way for any driver to ask the
> core to lay out DMA'ble struct pages over BAR space. I am not convinced
> this requires a "p2mem device" to be created on top of this though but
> that's a different discussion.
> 
> Of course the actual ability to perform the DMA mapping will be subject
> to various restrictions that will have to be implemented in the actual
> "dma_ops override" backend. We can have generic code to handle the case
> where devices reside on the same domain, which can deal with switch
> configuration etc... we will need to have iommu specific code to handle
> the case going through the fabric. 
> 
> Virtualization is a separate can of worms due to how qemu completely
> fakes the MMIO space, we can look into that later.

My first reflex when reading this thread was to think that this whole domain
lends itself excellently to testing via Qemu. Could it be that doing this in
the opposite direction might be a safer approach in the long run, even though
it means (significantly) more work up front?

E.g. start by fixing/providing/documenting suitable model(s)
for testing this in Qemu, then implement the patch set based
on those models?

Thanks,
Knut

> 
> Cheers,
> Ben.


