[PATCH v8 3/5] iommu/of: Add msi address regions reservation helper

Robin Murphy robin.murphy at arm.com
Thu Oct 5 05:55:10 PDT 2017


On 05/10/17 12:57, Will Deacon wrote:
> On Thu, Oct 05, 2017 at 12:07:26PM +0100, Robin Murphy wrote:
>> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
>>> On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
>>>> On 27/09/17 14:32, Shameer Kolothum wrote:
>>>>> From: John Garry <john.garry at huawei.com>
>>>>>
>>>>> On some platforms msi-controller address regions have to be excluded
>>>>> from normal IOVA allocation in that they are detected and decoded in
>>>>> a HW specific way by system components and so they cannot be considered
>>>>> normal IOVA address space.
>>>>>
>>>>> Add a helper function that retrieves msi address regions through device
>>>>> tree msi mapping, so that these regions will not be translated by IOMMU
>>>>> and will be excluded from IOVA allocations.
>>>>>
>>>>> Signed-off-by: John Garry <john.garry at huawei.com>
>>>>> [Shameer: Modified msi-parent retrieval logic]
>>>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi at huawei.com>
>>>>> ---
>>>>>  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  include/linux/of_iommu.h | 10 +++++
>>>>>  2 files changed, 105 insertions(+)
>>>>>
>>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>>> index e60e3db..ffd7fb7 100644
>>>>> --- a/drivers/iommu/of_iommu.c
>>>>> +++ b/drivers/iommu/of_iommu.c
>>>>> @@ -21,6 +21,7 @@
>>>>>  #include <linux/iommu.h>
>>>>>  #include <linux/limits.h>
>>>>>  #include <linux/of.h>
>>>>> +#include <linux/of_address.h>
>>>>>  #include <linux/of_iommu.h>
>>>>>  #include <linux/of_pci.h>
>>>>>  #include <linux/slab.h>
>>>>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>>>>>  	return ops->of_xlate(dev, iommu_spec);
>>>>>  }
>>>>>  
>>>>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>>>> +{
>>>>> +	u32 *rid = data;
>>>>> +
>>>>> +	*rid = alias;
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>>  struct of_pci_iommu_alias_info {
>>>>>  	struct device *dev;
>>>>>  	struct device_node *np;
>>>>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
>>>>>  	return info->np == pdev->bus->dev.of_node;
>>>>>  }
>>>>>  
>>>>> +static int of_iommu_alloc_resv_region(struct device_node *np,
>>>>> +				      struct list_head *head)
>>>>> +{
>>>>> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>> +	struct iommu_resv_region *region;
>>>>> +	struct resource res;
>>>>> +	int err;
>>>>> +
>>>>> +	err = of_address_to_resource(np, 0, &res);
>>>>
>>>> What is the rational for registering the first region only? Surely you
>>>> can have more than one region in an MSI controller (yes, your particular
>>>> case is the ITS which has only one, but we're trying to do something
>>>> generic here).
>>>>
>>>> Another issue I have with this code is that it inserts all of the ITS
>>>> MMIO in the RESV_MSI range. As long as we don't generate any page tables
>>>> for this, we're fine. But if we ever change this, we'll end-up with the
>>>> ITS programming interface being exposed to a device, which wouldn't be
>>>> acceptable.
>>>>
>>>> Why can't we have some generic property instead, that would describe the
>>>> actual ranges that cannot be translated? That way, no random insertion
>>>> of a random range, and relatively future proof.
>>
>> Indeed. Furthermore, IORT has the advantage of being limited to a few
>> distinct device types and SBSA-compliant system topologies, so the
>> ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
>> The scope of DT covers more embedded things as well like PCI host
>> controllers with internal MSI doorbells, and potentially even
>> direct-mapped memory regions for things like bootloader framebuffers to
>> prevent display glitches - a generic address/size/flags description of a
>> region could work for just about everything.
>>
>>> IORT code has the same issue (ie it reserves all ITS regions) and I do
>>> not know where a property can be added to describe those ranges (IORT
>>> specs ? I'd rather not) in ACPI other than the IORT tables entries
>>> themselves.
>>>
>>>> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
>>>> like a much nicer and simpler solution to this problem.
>>>
>>> It could be but if we throw ACPI into the picture we have to knock
>>> together Hisilicon specific namespace bindings to handle this and
>>> quickly.
>>
>> As above I'm happy with the ITS-specific solution for ACPI given the
>> limits of IORT. What I had in mind for DT was something tied in with the
>> generic IOMMU bindings. Something like this is probably easiest to
>> handle, but does rather spread the information around:
>>
>>
>>   pci {
>>   	...
>>   	iommu-map = <0 0 &iommu1 0x10000>;
>>   	iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>>   				<0x34560000 0x1000 IOMMU_MSI>;
>>   };
>>
>>   display {
>>   	...
>>   	iommus = <&iommu1 0x20000>;
>>   	/* simplefb region */
>>   	iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>,
>>   };
>>
>>
>> Alternatively, something inspired by reserved-memory might perhaps be
>> conceptually neater, at the risk of being more complicated:
>>
>>
>>   iommu1: iommu at acbd0000 {
>>   	...
>>   	#iommu-cells = <1>;
>>
>>   	iommu-reserved-ranges {
>>   		#address-cells = <1>;
>>   		#size-cells = <1>;
>>
>> 		its0_resv: msi at 12340000 {
>>   			compatible = "iommu-msi-region";
>>   			reg = <0x12340000 0x1000>;
>>   		};
>>
>> 		its1_resv: msi at 34560000 {
>>   			compatible = "iommu-msi-region";
>>   			reg = <0x34560000 0x1000>;
>>   		};
>>
>> 		fb_resv: direct at 12340000 {
>>   			compatible = "iommu-direct-region";
>>   			reg = <0x80080000 0x80000>;
>>   		};
>>   	};
>>   };
> 
> I like the locality of this, but is it definitely flexible enough? Do we
> need to deal with systems where the reserved regions are specific to the
> master (i.e. TBU) as opposed to the entire SMMU block?

That would certainly be true for most direct-mapping cases, where the
reservation is tied to the specific stream ID(s) of one master, let
alone a TBU. I guess we'd have to make a phandle reference from the
device node(s) to the region node mandatory, such that software need
only make the actual reservation/mapping upon encountering a device that
actually needs it.

Robin.



More information about the linux-arm-kernel mailing list