[PATCH 2/2] iommu: add warning when sharing groups

Sinan Kaya okaya at codeaurora.org
Wed Feb 15 13:43:17 PST 2017


On 2/15/2017 2:36 PM, Alex Williamson wrote:
> On Tue, 14 Feb 2017 22:53:35 -0500
> okaya at codeaurora.org wrote:
> 
>> On 2017-02-14 18:51, Alex Williamson wrote:
>>> On Tue, 14 Feb 2017 16:25:22 -0500
>>> Sinan Kaya <okaya at codeaurora.org> wrote:
>>>   
>>>> The ACS requirement has been obscured in the current code and is only
>>>> known by certain individuals who happen to read the code. Print out a
>>>> warning with ACS path failure when ACS requirement is not met.
>>>>
>>>> Signed-off-by: Sinan Kaya <okaya at codeaurora.org>
>>>> ---
>>>>  drivers/iommu/iommu.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>>> index dbe7f65..049ee0a 100644
>>>> --- a/drivers/iommu/iommu.c
>>>> +++ b/drivers/iommu/iommu.c
>>>> @@ -811,6 +811,9 @@ struct iommu_group *pci_device_group(struct device 
>>>> *dev)
>>>>  	if (IS_ERR(group))
>>>>  		return NULL;
>>>>
>>>> +	if (pci_is_root_bus(bus))
>>>> +		dev_warn_once(&pdev->dev, "using shared group due to ACS path 
>>>> failure\n");
>>>> +
>>>>  	return group;
>>>>  }
>>>>   
>>>
>>> The premise here is flawed.  An IOMMU group based at the root bus
>>> doesn't necessarily imply a lack of ACS.  There are devices on root
>>> buses, integrated endpoints and root ports.  Naturally an IOMMU group
>>> for these devices needs to be based at the root bus.  Additionally,
>>> there can be IOMMU groups developed around a lack of ACS that don't
>>> intersect with the root bus.  Since this is a warn_once, the false
>>> positives for root bus devices are going to be enumerated first.  On an
>>> Intel system there's typically a device as 00.0 that will always be
>>> pointlessly listed first.  Also, it's not clear that grouping devices
>>> together is always wrong, as Robin pointed out in the EHCI/OHCI
>>> example.  Lack of ACS on downtream ports is likely to cause problems,
>>> especially if that downstream port exposes a slot.  Maybe that would be
>>> a good place to start.  Also, what is someone supposed to do when they
>>> see this error?  If we can hope they'll look for the error in the code
>>> (unlikely) a big comment with useful external links might be
>>> necessary.  Based on how easily vendors ignore kernel warnings, I'm
>>> dubious there's any value to this path though.  Thanks,  
>>
>> Maybe, a better solution would be to add some sentences into vfio.txt 
>> documentation.
>>
>> I'm ready to drop this patch. I just don't want ACS requirement to be 
>> hidden between the source code.
>>
>> Would you be willing to do that?
>>
>> I know I read all pci and vfio documents in the past. I could have 
>> captured this requirement if it was there.
> 
> We already have this:
> 
> Documentation/vfio.txt:
> ...
> This isolation is not always at the granularity of a single device
> though.  Even when an IOMMU is capable of this, properties of devices,
> interconnects, and IOMMU topologies can each reduce this isolation.
> For instance, an individual device may be part of a larger multi-
> function enclosure.  While the IOMMU may be able to distinguish
> between devices within the enclosure, the enclosure may not require
> transactions between devices to reach the IOMMU.  Examples of this
> could be anything from a multi-function PCI device with backdoors
> between functions to a non-PCI-ACS (Access Control Services) capable
> bridge allowing redirection without reaching the IOMMU.  Topology
> can also play a factor in terms of hiding devices.  A PCIe-to-PCI
> bridge masks the devices behind it, making transaction appear as if
> from the bridge itself.  Obviously IOMMU design plays a major factor
> as well.
> ...
> 
> Additionally if you google for "iommu group", this is the first hit:
> 
> http://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html
> 
> This talks extensively about ACS.  A few hits below that you can find a
> presentation I've given with ACS examples.  What additional
> documentation do you think would have helped you discover or understand
> this problem earlier?
> 
> I agree that device isolation is not a spec requirement.  The specs
> give us the tools that we need, but valid uses cases exist where a lack
> of isolation may be preferred.  If we logically deduce how we can give
> a device or set of devices to a user for an untrusted environment,
> isolated from other devices, I think it's pretty logical to come to the
> conclusion that ACS is the only way that PCIe hardware can allow that
> sort of control in a standard way.  Clearly we also recognize that this
> is a commonly overlooked area where hardware vendors may fail to
> incorporate this subtly into their platform design guidelines and thus
> we have numerous quirks for exposing virtual ACS-like isolation.  The
> hope is that adding a quirk for this devices means that feedback was
> provided to the hardware teams and system architects within the
> companies developing these devices to consider this use case and
> implement native ACS in the next generation.  Thanks,


I see, Maybe I was not familiar with ACS to understand these words
by the time I read it.


> 
> Alex
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.



More information about the linux-arm-kernel mailing list