[PATCH] iommu: Always fill in gather when unmapping
Jason Gunthorpe
jgg at nvidia.com
Wed Apr 1 10:36:50 PDT 2026
On Wed, Apr 01, 2026 at 05:33:28PM +0100, Robin Murphy wrote:
> > io-pgtable might have intended to allow the driver to choose between
> > gather or immediate flush because it passed gather to
> > ops->tlb_add_page(), however no driver does anything with it.
>
> Apart from arm-smmu-v3...
Bah, I did my research on the wrong tree and missed this.
> > mtk uses io-pgtable-arm-v7s but added the range to the gather in the
> > unmap callback. Move this into the io-pgtable-arm unmap itself. That
> > will fix all the armv7 using drivers (arm-smmu, qcom_iommu,
> > ipmmu-vmsa).
>
> io-pgtable-arm-v7s != io-pgtable-arm. You're *breaking* MTK (and failing
> to fix the other v7s user, which is MSM).
I was very confused about what you were talking about, but I see now that
the hunk adding iommu_iotlb_gather_add_range() to v7s got lost somehow!
@@ -596,6 +596,9 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
 	__arm_v7s_set_pte(ptep, 0, num_entries, &iop->cfg);
+	if (!iommu_iotlb_gather_queued(gather))
+		iommu_iotlb_gather_add_range(gather, iova, size);
+
 	for (i = 0; i < num_entries; i++) {
 		if (ARM_V7S_PTE_IS_TABLE(pte[i], lvl)) {
 			/* Also flush any partial walks */
> > arm-smmu uses both ARM_V7S and ARM LPAE formats. The LPAE formats
> > already have the gather population because SMMUv3 requires it, so it
> > becomes consistent.
>
> Huh? arm-smmu-v3 invokes iommu_iotlb_gather_add_page() itself, because
> arm-smmu-v3 uses gathers
Yeah, I missed this whole bit, it needs some changes.
> Invoking add range before add_page will end up defeating the
> iommu_iotlb_gather_is_disjoint() check and making SMMUv3
> overinvalidate between disjoint ranges.
Right, that flow needs fixing.
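The interaction described above can be sketched with a small userspace
model. This is an illustration only, not kernel code: the model_* names
are hypothetical stand-ins that mirror, as far as I understand them, the
semantics of iommu_iotlb_gather_add_range(), iommu_iotlb_gather_is_disjoint()
and iommu_iotlb_gather_add_page(), and the "sync" here just counts the
bytes that would have been invalidated.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct iommu_iotlb_gather. */
struct model_gather {
	unsigned long start;	/* lowest IOVA covered */
	unsigned long end;	/* highest IOVA covered, inclusive */
	size_t pgsize;		/* page size of the gathered entries */
	unsigned long flushed;	/* model only: bytes invalidated so far */
};

/* Reset to the "empty" state: start = ULONG_MAX, end = 0. */
static void model_gather_init(struct model_gather *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
	g->pgsize = 0;
}

/* Widen the gather so that it covers [iova, iova + size - 1]. */
static void model_add_range(struct model_gather *g,
			    unsigned long iova, size_t size)
{
	unsigned long end = iova + size - 1;

	if (g->start > iova)
		g->start = iova;
	if (g->end < end)
		g->end = end;
}

/* True if the gather is non-empty and the new range does not touch it. */
static bool model_is_disjoint(const struct model_gather *g,
			      unsigned long iova, size_t size)
{
	unsigned long start = iova, end = start + size - 1;

	return g->end != 0 &&
	       (end + 1 < g->start || start > g->end + 1);
}

/* Model flush: account for the whole gathered span, then reset. */
static void model_sync(struct model_gather *g)
{
	g->flushed += g->end - g->start + 1;
	model_gather_init(g);
}

/* Flush first if the page cannot be merged, then add it. */
static void model_add_page(struct model_gather *g,
			   unsigned long iova, size_t size)
{
	if ((g->pgsize && g->pgsize != size) ||
	    model_is_disjoint(g, iova, size))
		model_sync(g);

	g->pgsize = size;
	model_add_range(g, iova, size);
}

/*
 * Unmap two 4 KiB pages that are far apart and return the total number
 * of bytes invalidated. With pre_add_range set, the page table code adds
 * the range before the driver's add_page runs.
 */
unsigned long model_unmap_two_pages(bool pre_add_range)
{
	const unsigned long iovas[2] = { 0x1000, 0x100000 };
	struct model_gather g = { 0 };
	int i;

	model_gather_init(&g);
	for (i = 0; i < 2; i++) {
		if (pre_add_range)
			model_add_range(&g, iovas[i], 0x1000);
		model_add_page(&g, iovas[i], 0x1000);
	}
	model_sync(&g);		/* final flush at the end of the unmap */
	return g.flushed;
}
```

In this model, model_unmap_two_pages(false) invalidates only the two
pages, because the disjoint check triggers a flush between them. With
model_unmap_two_pages(true) the gather already covers the new page when
the disjoint check runs, so no intermediate flush happens and the final
flush spans the whole gap between the two pages.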
> I guess now I remember why we weren't validating gathers in core code
> before :(
My point is that not filling the gather is a micro-optimization that
benefits a few drivers. I think the saving is so small compared to an
IOTLB flush that it isn't worth worrying about.
So, I'd like to make everything the same and populate the gather
correctly in all flows. I'll fix the SMMUv3 thing and let's look again;
this patch is not so scary as to make me think we shouldn't do that.
> @@ -2714,6 +2714,10 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
> pr_debug("unmapped: iova 0x%lx size 0x%zx\n",
> iova, unmapped_page);
> + /* If the driver itself isn't using the gather, mark it used */
> + if (iotlb_gather->end <= iotlb_gather->start)
> + iommu_iotlb_gather_add_range(&iotlb_gather, iova, unmapped_page);
The gathers can be joined across unmaps and now we are inviting subtly
ill-formed gathers as only the first unmap will get included.
We do have error cases where the gather is legitimately empty, and
this would squash that; it probably needs to check unmapped_page for 0
too, at least.
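Both concerns can be illustrated with a minimal userspace sketch. It is
not kernel code: the sketch_* names are hypothetical, the gather is
reduced to its start/end fields, and the driver is assumed never to fill
the gather itself, so only the quoted "end <= start" check adds ranges.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct iommu_iotlb_gather. */
struct sketch_gather {
	unsigned long start;	/* lowest IOVA covered */
	unsigned long end;	/* highest IOVA covered, inclusive */
};

/* Widen the gather so that it covers [iova, iova + size - 1]. */
static void sketch_add_range(struct sketch_gather *g,
			     unsigned long iova, size_t size)
{
	unsigned long end = iova + size - 1;

	if (g->start > iova)
		g->start = iova;
	if (g->end < end)
		g->end = end;
}

/* Model of the quoted check: only add the range if the gather looks unused. */
static void sketch_core_unmap(struct sketch_gather *g,
			      unsigned long iova, size_t unmapped)
{
	if (g->end <= g->start)
		sketch_add_range(g, iova, unmapped);
}

/* Two successful unmaps joined in one gather; returns the final end. */
unsigned long sketch_two_unmaps_end(void)
{
	struct sketch_gather g = { .start = ULONG_MAX, .end = 0 };

	sketch_core_unmap(&g, 0x1000, 0x1000);	/* gather looks unused: added */
	sketch_core_unmap(&g, 0x5000, 0x1000);	/* gather looks used: skipped */
	return g.end;
}

/* A failed unmap (0 bytes unmapped) on an empty gather; returns the end. */
unsigned long sketch_failed_unmap_end(void)
{
	struct sketch_gather g = { .start = ULONG_MAX, .end = 0 };

	sketch_core_unmap(&g, 0x1000, 0);
	return g.end;
}
```

In this model the second unmap at 0x5000 never reaches the gather, so
sketch_two_unmaps_end() stops at the end of the first page. The failed
unmap leaves the gather with a non-zero end even though nothing was
unmapped, which is why an additional check of unmapped_page against 0
looks necessary at a minimum.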
Thanks,
Jason