[PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
Alex Williamson
alex.williamson at redhat.com
Tue Apr 5 12:50:36 PDT 2022
On Tue, 5 Apr 2022 13:16:02 -0300
Jason Gunthorpe <jgg at nvidia.com> wrote:
> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
>
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
>
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
>
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
>
> Only VFIO will call this new API.
>
> Signed-off-by: Jason Gunthorpe <jgg at nvidia.com>
> ---
> drivers/iommu/amd/iommu.c | 7 +++++++
> drivers/iommu/intel/iommu.c | 14 +++++++++++++-
> include/linux/intel-iommu.h | 1 +
> include/linux/iommu.h | 4 ++++
> 4 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e61..e500b487eb3429 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
> return 0;
> }
>
> +static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> + /* IOMMU_PTE_FC is always set */
> + return true;
> +}
> +
> const struct iommu_ops amd_iommu_ops = {
> .capable = amd_iommu_capable,
> .domain_alloc = amd_iommu_domain_alloc,
> @@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
> .flush_iotlb_all = amd_iommu_flush_iotlb_all,
> .iotlb_sync = amd_iommu_iotlb_sync,
> .free = amd_iommu_domain_free,
> + .enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
> }
> };
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index df5c62ecf942b8..f08611a6cc4799 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
> prot |= DMA_PTE_READ;
> if (iommu_prot & IOMMU_WRITE)
> prot |= DMA_PTE_WRITE;
> - if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> + if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
> + dmar_domain->enforce_no_snoop)
> prot |= DMA_PTE_SNP;
>
> max_addr = iova + size;
> @@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
> return phys;
> }
>
> +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +
> + if (!dmar_domain->iommu_snooping)
> + return false;
> + dmar_domain->enforce_no_snoop = true;
> + return true;
> +}
Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
support it, ie. reserved register bit set in pte faults? It seems
really inconsistent here that I could make a domain that supports
iommu_snooping, set enforce_no_snoop = true, then add another DMAR to
the domain that may not support iommu_snooping, I'd get false on the
subsequent enforcement test, but the dmar_domain is still trying to use
DMA_PTE_SNP.
There's also a disconnect, maybe just in the naming or documentation,
but if I call enforce_cache_coherency for a domain, that seems like the
domain should retain those semantics regardless of how it's modified,
ie. "enforced". For example, if I tried to perform the above operation,
I should get a failure attaching the device that brings in the less
capable DMAR because the domain has been set to enforce this feature.
If the API is that I need to re-enforce_cache_coherency on every
modification of the domain, shouldn't dmar_domain->enforce_no_snoop
also return to a default value on domain changes?
Maybe this should be something like set_no_snoop_squashing with the
above semantics, it needs to be re-applied whenever the domain:device
composition changes? Thanks,
Alex
> +
> static bool intel_iommu_capable(enum iommu_cap cap)
> {
> if (cap == IOMMU_CAP_CACHE_COHERENCY)
> @@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
> .iotlb_sync = intel_iommu_tlb_sync,
> .iova_to_phys = intel_iommu_iova_to_phys,
> .free = intel_iommu_domain_free,
> + .enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
> }
> };
>
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -540,6 +540,7 @@ struct dmar_domain {
> u8 has_iotlb_device: 1;
> u8 iommu_coherency: 1; /* indicate coherency of iommu access */
> u8 iommu_snooping: 1; /* indicate snooping control feature */
> + u8 enforce_no_snoop : 1; /* Create IOPTEs with snoop control */
>
> struct list_head devices; /* all devices' list */
> struct iova_domain iovad; /* iova's that belong to this domain */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 9208eca4b0d1ac..fe4f24c469c373 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -272,6 +272,9 @@ struct iommu_ops {
> * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> * queue
> * @iova_to_phys: translate iova to physical address
> + * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> + * including no-snoop TLPs on PCIe or other platform
> + * specific mechanisms.
> * @enable_nesting: Enable nesting
> * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
> * @free: Release the domain after use.
> @@ -300,6 +303,7 @@ struct iommu_domain_ops {
> phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
> dma_addr_t iova);
>
> + bool (*enforce_cache_coherency)(struct iommu_domain *domain);
> int (*enable_nesting)(struct iommu_domain *domain);
> int (*set_pgtable_quirks)(struct iommu_domain *domain,
> unsigned long quirks);
More information about the linux-arm-kernel
mailing list