[PATCH v7 6/7] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range()

Jason Gunthorpe jgg at nvidia.com
Tue Dec 16 05:56:13 PST 2025


On Tue, Dec 16, 2025 at 10:09:26AM +0100, Peter Zijlstra wrote:
> Anyway, if I understand the above correctly, the smb_mb() is for:
> 
>   arm_smmu_domain_inv_range() 		arm_smmu_install_new_domain_invs()
> 
>     [W] IOPTE				  [Wrel] smmu_domain->invs
>     smp_mb()				  smp_mb()
>     [Lacq] smmu_domain->invs		  [L] IOPTE
> 
> Right? But I'm not sure about your 'HW sees the new IOPTEs' claim;

Yes, the '[L] IOPTE' would be a DMA from HW.

> that very much depend on what coherency domain the relevant hardware
> plays in. For smp_mb() to work, the hardware must be in the ISH
> domain, while typically devices are (if I remember my arrrrgh64
> correctly) in the OSH.

The '[W] IOPTE' sequence already includes a cache flush if the
inner/outer shareable domains are not coherent with each other. If a
cache flush was required then the smp_mb() must also order it;
otherwise it only has to order the store.
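
To make the pairing concrete, here is a minimal sketch of the two
sides, assuming the invalidation array is published through a single
'invs' pointer on the domain as in the diagram above; the names and
structure layout are illustrative stand-ins, not the exact code in
this patch:

   #include <linux/smp.h>
   #include <asm/barrier.h>

   struct invs;                          /* invalidation array, opaque here */

   struct domain_sketch {
           struct invs *invs;            /* published with store-release */
   };

   /* invalidation path, runs after the caller has written the IOPTEs */
   static void inv_range_sketch(struct domain_sketch *d)
   {
           struct invs *invs;

           /* [W] IOPTE (plus any cache flush) happened in the caller */
           smp_mb();                             /* order IOPTE write before the load */
           invs = smp_load_acquire(&d->invs);    /* [Lacq] d->invs */
           /* ... walk invs and issue the range invalidations ... */
   }

   /* attach path, publishes a new invs array before the HW can walk */
   static void install_invs_sketch(struct domain_sketch *d,
                                   struct invs *new_invs)
   {
           smp_store_release(&d->invs, new_invs); /* [Wrel] d->invs */
           smp_mb();                              /* order the publish before HW can [L] IOPTE */
           /* ... the STE/CD install then lets the device walk the tables ... */
   }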

The page table code has always relied on this kind of ordering with
respect to DMA working; it would be completely broken if DMA did not
order with the barriers.

For example:

            CPU0                         CPU1
   store PMD
                                         read PMD
   store PTE 1                           store PTE 2
                                         dma memory barrier
                                         device reads 2
   dma memory barrier
   device reads 1


The 'device reads 2' thread must be guaranteed that the HW DMA
observes the PMD stored by CPU0. It relies on the same kind of
explicit cache flushing and barriers as this patch does.
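
The same pattern shows up in any driver that posts descriptors for a
device to read. A generic sketch of the 'dma memory barrier' in the
diagram (illustrative names, not code from this patch):

   #include <linux/types.h>
   #include <linux/io.h>
   #include <asm/barrier.h>

   #define DESC_VALID      0x1

   struct desc {
           u64 addr;
           u32 len;
           u32 flags;
   };

   static void post_desc_sketch(struct desc *ring, u32 idx,
                                void __iomem *doorbell, u64 buf, u32 len)
   {
           ring[idx].addr = buf;           /* like CPU0's 'store PMD / store PTE' */
           ring[idx].len  = len;

           dma_wmb();                      /* payload visible before the valid bit */

           ring[idx].flags = DESC_VALID;   /* the bit the device reads */

           writel(idx + 1, doorbell);      /* writel() orders the stores above
                                              before the MMIO that kicks the device */
   }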

Jason


