[PATCH] iommu/dma: Add config for PCI SAC address trick

John Garry john.garry at huawei.com
Thu May 19 03:20:40 PDT 2022


On 18/05/2022 18:36, Robin Murphy wrote:
> For devices stuck behind a conventional PCI bus, saving extra cycles at
> 33MHz is probably fairly significant. However since native PCI Express
> is now the norm for high-performance devices, the optimisation to always
> prefer 32-bit addresses for the sake of avoiding DAC is starting to look
> rather anachronistic. Technically 32-bit addresses do have shorter TLPs
> on PCIe, but unless the device is saturating its link bandwidth with
> small transfers it seems unlikely that the difference is appreciable.
> 
> What definitely is appreciable, however, is that the IOVA allocator
> doesn't behave all that well once the 32-bit space starts getting full.
> As DMA working sets get bigger, this optimisation increasingly backfires
> and adds considerable overhead to the dma_map path for use-cases like
> high-bandwidth networking. We've increasingly bandaged the allocator
> in attempts to mitigate this, but it remains fundamentally at odds with
> other valid requirements to try as hard as possible to satisfy a request
> within the given limit; what we really need is to just avoid this odd
> notion of a speculative allocation when it isn't beneficial anyway.
> 
> Unfortunately that's where things get awkward... Having been present on
> x86 for 15 years or so now, it turns out there are systems which fail to
> properly define the upper limit of usable IOVA space for certain devices
> and this trick was the only thing letting them work OK. I had a similar
> ulterior motive for a couple of early arm64 systems when originally
> adding it to iommu-dma, but those really should now be fixed with proper
> firmware bindings, and other arm64 users really need it out of the way,
> so let's just leave it default-on for x86.
> 
> Signed-off-by: Robin Murphy <robin.murphy at arm.com>
> ---
>   drivers/iommu/Kconfig     | 24 ++++++++++++++++++++++++
>   drivers/iommu/dma-iommu.c |  2 +-

It might be worth always printing this default value, not only when it
is set from the command line, as we already do for the default domain
type and the IOTLB invalidation policy.
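
To make the suggestion concrete, here is a minimal user-space sketch of
such a message. Everything in it is an assumption for illustration: the
helper name toy_format_sac_policy(), its parameters and the message
wording are hypothetical and are not taken from the patch or from any
existing kernel code.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a boot-time line that always reports the
 * SAC policy, and notes when it came from the command line.
 * forcedac == true means the 32-bit-first optimisation is disabled.
 */
static int toy_format_sac_policy(char *buf, size_t len,
				 bool forcedac, bool from_cmdline)
{
	return snprintf(buf, len, "iommu: PCI SAC address trick: %s%s",
			forcedac ? "disabled" : "enabled",
			from_cmdline ? " (set via kernel command line)" : "");
}
```

In the kernel this would presumably be a pr_info() call rather than
snprintf(), but the point is the same: the line is emitted whether or
not the value was overridden.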

>   2 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index c79a0df090c0..bf9b295f1c89 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -144,6 +144,30 @@ config IOMMU_DMA
>   	select IRQ_MSI_IOMMU
>   	select NEED_SG_DMA_LENGTH
>   
> +config IOMMU_DMA_PCI_SAC_OPT
> +	bool "Enable 64-bit legacy PCI optimisation by default"
> +	depends on IOMMU_DMA
> +	default X86

Do we have a strategy for when issues start popping up on other
architectures? Is it simply to tell users to turn this option on (and
also to fix their platform)?

> +	help
> +	  Enable by default an IOMMU optimisation for 64-bit legacy PCI devices,
> +	  wherein the DMA API layer will always first try to allocate a 32-bit
> +	  DMA address suitable for a single address cycle, before falling back
> +	  to allocating from the full usable address range. If your system has
> +	  64-bit legacy PCI devices in 32-bit slots where using dual address
> +	  cycles reduces DMA throughput significantly, this optimisation may be
> +	  beneficial to overall performance.
> +
> +	  If you have a modern PCI Express based system, it should usually be
> +	  safe to say "n" here and avoid the potential extra allocation overhead.
> +	  However, beware that this optimisation has also historically papered
> +	  over bugs where the IOMMU address range above 32 bits is not fully
> +	  usable. If device DMA problems and/or IOMMU faults start occurring
> +	  with IOMMU translation enabled after disabling this option, it is
> +	  likely a sign of a latent driver or firmware/BIOS bug.
> +
> +	  If this option is not set, the optimisation can be enabled at
> +	  boot time with the "iommu.forcedac=0" command-line argument.
> +
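
For readers following along, the policy described in the help text
(try a 32-bit address first, then fall back to the full usable range)
can be sketched in user-space C as below. This is only an illustration
under stated assumptions: struct toy_iova_space, toy_alloc_range() and
toy_alloc_iova() are hypothetical names, and the bump allocator stands
in for the real IOVA allocator, which works quite differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_LIMIT_32BIT 0xFFFFFFFFull

/* Hypothetical toy state: one bump pointer per region (not kernel code). */
struct toy_iova_space {
	uint64_t next_low;	/* next free address below 4 GiB */
	uint64_t next_high;	/* next free address at or above 4 GiB */
};

/* Allocate size bytes ending at or below limit; 0 means failure. */
static uint64_t toy_alloc_range(uint64_t *next, uint64_t size, uint64_t limit)
{
	uint64_t addr;

	if (size == 0 || *next > limit || size - 1 > limit - *next)
		return 0;
	addr = *next;
	*next += size;
	return addr;
}

static uint64_t toy_alloc_iova(struct toy_iova_space *s, uint64_t size,
			       uint64_t dma_limit, bool force_dac)
{
	uint64_t low_limit = dma_limit < TOY_LIMIT_32BIT ?
			     dma_limit : TOY_LIMIT_32BIT;
	uint64_t iova = 0;

	/* Speculative attempt at a single-address-cycle (32-bit) address. */
	if (dma_limit > TOY_LIMIT_32BIT && !force_dac)
		iova = toy_alloc_range(&s->next_low, size, TOY_LIMIT_32BIT);

	/* Fall back to the full usable range, highest region first. */
	if (!iova) {
		iova = toy_alloc_range(&s->next_high, size, dma_limit);
		if (!iova)
			iova = toy_alloc_range(&s->next_low, size, low_limit);
	}
	return iova;
}
```

With force_dac false, a 48-bit capable device gets a low address until
the low region is exhausted and only then moves above 4 GiB. With
force_dac true the speculative step is skipped entirely, which is the
extra allocation attempt the option is meant to avoid.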

Thanks,
John


