[PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

Sinan Kaya okaya at codeaurora.org
Tue Mar 13 15:29:37 PDT 2018


On 3/13/2018 6:00 PM, Logan Gunthorpe wrote:
> 
> 
> On 13/03/18 03:22 PM, Sinan Kaya wrote:
>> It sounds like you have very tight hardware expectations for this to work
>> at this moment. You also don't want to generalize this code for others and
>> address the shortcomings.
> 
> No, that's the way the community has pushed this work. Our original work
> was very general and we were told it was unacceptable to put the onus on
> the user and have things break if the hardware doesn't support it. I
> think that's a reasonable requirement. So the hardware use-cases were
> wittled down to the ones we can be confident about and support with
> reasonable changes to the kernel today.

If hardware doesn't support it, blacklisting should have been the right
path and I still think that you should remove all switch business from the code.
I did not hear enough justification for having a switch requirement
for P2P.

You are also saying that root ports have issues not because of functionality but
because of performance. 

If you know what is bad, create a list and keep adding it. You can't assume
universally that all root ports are broken/ have bad performance.

> 
>> To get you going, you should limit this change to the switch products that you have
>> validated via white-listing PCI vendor/device ids. Please do not enable this feature
>> for all other PCI devices or by default.
> 
> This doesn't seem necessary. We know that all PCIe switches available
> today support P2P and we are highly confident that any switch that would
> ever be produced would support P2P. As long as devices are behind a
> switch you don't need any white listing. This is what the current patch
> set implements. If you want to start including root ports then you will
> need a white list (and solve all the other problems I mentioned earlier).

What if I come up with a very cheap/crappy switch (something like used in data
mining)?

What guarantees that P2P will work with this device? You are making an
assumption here that all switches have good performance.

How is that any different from good switch vs. bad switch and good root
port vs. bad root port?

If it is universally broken, why don't you list the things that work?

> 
>> I think your code qualifies as a virus until this issue is resolved (so NAK). 
> 
> That seems a bit hyperbolic... "a virus"??!... please keep things
> constructive.
> 

Sorry, this was harsh. I'm taking "virus" word back. I apologize. 
But, I hold onto my NAK opinion. 

I have been doing my best to provide feedback. It feels like you are throwing
them over the wall to be honest.

You keep implying "not my problem".

> > I agree disabling globally would be bad. Somebody can always say I have
> > ten switches on my system. I want to do peer-to-peer on one switch only. Now,
> > this change weakened security for the other switches that I had no intention
> > with doing P2P.
> >
> > Isn't this a problem?
> 
> Well, if it's a problem for someone they'll have to solve it. We're
> targeting JBOFs that have no use for ACS / IOMMU groups at all.

IMO, you (not somebody) should address this one way or the other before this
series land in upstream.

>> You are delivering a general purpose P2P code with a lot of holes in it and
>> expecting people to jump through it.
> No, the code prevents users from screwing it up. It just requires a
> switch in the hardware which is hardly a high bar to jump through
> (provided you are putting some design thought into your hardware). And
> given it requires semi-custom hardware today, it's not something that
> needs to be on by default in any distributor kernel.
> 
>> Turning security off by default is also not acceptable. Linux requires ACS support
>> even though you don't care about it for your particular application.
> 
> That's not true. Linux does not require ACS support. In fact it's
> already off by default until you specifically turn on the IOMMU. (Which
> is not always the most obvious thing to enable.) And the only thing it
> really supports is enforcing isolation between VMs that are using
> different pieces of hardware in the system.

Another assumption: There are other architectures like ARM64 where IOMMU
is enabled by default even if you don't use VMs for security reasons.
IOMMU blocks stray transactions.

> 
>> I'd hate ACS to be broken due to some operating system enabling your CONFIG option.
> 
> ACS isn't "broken" by enabling the config option. It just makes the
> IOMMU groups and isolation less granular. (ie. devices behind a switch
> will be in the same group and not isolated from each-other).

Didn't the ACS behavior change suddenly for no good reason when we enabled
your code even though I might not be using the P2P but I happen to have
a kernel with P2P config option?

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.



More information about the Linux-nvme mailing list