[PATCH] iommu/dma: Stop getting dma_32bit_pfn wrong
Robin Murphy
robin.murphy at arm.com
Tue Nov 15 08:22:39 PST 2016
On 15/11/16 11:49, Joerg Roedel wrote:
> On Fri, Nov 11, 2016 at 06:30:45PM +0000, Robin Murphy wrote:
>> iommu_dma_init_domain() was originally written under the misconception
>> that dma_32bit_pfn represented some sort of size limit for IOVA domains.
>> Since the truth is almost the exact opposite of that, rework the logic
>> and comments to reflect its real purpose of optimising lookups when
>> allocating from a subset of the available space.
>>
>> Signed-off-by: Robin Murphy <robin.murphy at arm.com>
>> ---
>> drivers/iommu/dma-iommu.c | 23 ++++++++++++++++++-----
>> 1 file changed, 18 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index c5ab8667e6f2..ae045a14b530 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -139,6 +139,7 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>> {
>> struct iova_domain *iovad = cookie_iovad(domain);
>> unsigned long order, base_pfn, end_pfn;
>> + bool pci = dev && dev_is_pci(dev);
>>
>> if (!iovad)
>> return -ENODEV;
>> @@ -161,19 +162,31 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>> end_pfn = min_t(unsigned long, end_pfn,
>> domain->geometry.aperture_end >> order);
>> }
>> + /*
>> + * PCI devices may have larger DMA masks, but still prefer allocating
>> + * within a 32-bit mask to avoid DAC addressing. Such limitations don't
>> + * apply to the typical platform device, so for those we may as well
>> + * leave the cache limit at the top of the range they're likely to use.
>> + */
>> + if (pci)
>> + end_pfn = min_t(unsigned long, end_pfn,
>> + DMA_BIT_MASK(32) >> order);
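(For the avoidance of doubt: end_pfn here only ever feeds the last
argument of init_iova_domain(), which seeds the allocator's cached
rbtree search node - a hint about where allocations are expected to
land, not an upper bound on the space. A minimal sketch, assuming the
4-argument init_iova_domain() from <linux/iova.h>:

	/*
	 * end_pfn is purely a lookup optimisation: it tells the IOVA
	 * allocator where to cache its search start point. IOVAs above
	 * it can still be handed out to callers with a higher limit_pfn.
	 */
	init_iova_domain(iovad, 1UL << order,	/* IOVA granule size */
			 base_pfn,		/* lowest allocatable pfn */
			 end_pfn);		/* cached-node hint */

so nothing in this hunk shrinks what a device can actually address.)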
>
> Question, does it actually hurt platform devices to follow the same
> allocation strategy as PCI devices? I mean, does it hurt enough to
> special-case the code here?
It hurts them in the sense that they get absolutely no benefit from
trying a lower limit first (the device simply has as many address lines
wired to the interconnect as it needs/can drive), so there's nothing to
offset the cost of every allocation turning into try-fail-and-try-again
once the sub-32-bit space fills up with, say, big GPU mappings. The
maintenance cost of a couple of one-line if statements doesn't seem big
enough to justify denying a significant set of devices - bear in mind
that the first products really using this code in anger (MT8173) don't
have PCI at all - straightforward and optimal allocation behaviour all
the time, every time.
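To illustrate the cost (a minimal sketch only, not the actual
dma-iommu.c code - it assumes the alloc_iova() interface from
<linux/iova.h> and DMA_BIT_MASK() from <linux/dma-mapping.h>, and the
function name is made up):

	/*
	 * Try-low-first allocation: a PCI device with a >32-bit DMA mask
	 * first attempts a sub-32-bit IOVA to get a SAC address, then
	 * falls back to its full limit. Once the sub-32-bit space is
	 * exhausted, every allocation pays for the failed first attempt.
	 * 'length' is in IOVA pages.
	 */
	static struct iova *sketch_alloc_iova(struct iova_domain *iovad,
			unsigned long length, dma_addr_t dma_limit,
			bool is_pci)
	{
		unsigned long shift = iova_shift(iovad);
		struct iova *iova = NULL;

		if (is_pci && dma_limit > DMA_BIT_MASK(32))
			iova = alloc_iova(iovad, length,
					DMA_BIT_MASK(32) >> shift, true);
		if (!iova)	/* sub-32-bit space full, or not PCI */
			iova = alloc_iova(iovad, length,
					dma_limit >> shift, true);
		return iova;
	}

The point of the special case is that a platform device skips the first
attempt entirely and always takes the single direct allocation.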
I suppose it might bear saying that, as a Cambridge Water customer, I do
tend to view PCI in the same light as USB/I2C/etc.: "just another
external bus controller on the outside edge of 'the system'". From that
perspective, avoiding DAC overhead really does look like the minority
special case, although I appreciate that the view from the x86 angle is
rather different. I also suspect that, whether we want to or not, we may
continue to see more on-chip 'PCI' devices which don't really need this
either (because they're really just 'platform' devices talking directly
to whatever coherent interconnect, with a config space bolted on
somewhere).
Robin.