[PATCH] iommu/io-pgtable-arm: Remove split on unmap behavior

Jason Gunthorpe jgg at nvidia.com
Fri Nov 1 08:37:50 PDT 2024


On Fri, Nov 01, 2024 at 11:58:29AM +0000, Will Deacon wrote:
> On Fri, Oct 18, 2024 at 02:19:26PM -0300, Jason Gunthorpe wrote:
> > Of the page table implementations (AMD v1/2, VT-D SS, ARM32, DART)
> > arm_lpae is unique in how it handles partial unmap of large IOPTEs.
> > 
> > All other drivers will unmap the large IOPTE and return its length.  For
> > example if a 2M IOPTE is present and the first 4K is requested to be
> > unmapped then unmap will remove the whole 2M and report 2M as the result.
> > 
> > arm_lpae instead replaces the IOPTE with a table of smaller IOPTEs, unmaps
> > the 4K and returns 4K. This is actually an illegal/non-hitless operation
> > on at least SMMUv3 because of the BBM level 0 rules.
> > 
> > Long ago VFIO could trigger a path like this, today I know of no user of
> > this functionality.
> > 
> > Given it doesn't work fully correctly on SMMUv3 and would create
> > portability problems if any user depends on it, remove the unique support
> > in arm_lpae and align with the expected iommu interface.
> > 
> > Outside the iommu users, this will potentially affect io_pgtable users of
> > ARM_32_LPAE_S1, ARM_32_LPAE_S2, ARM_64_LPAE_S1, ARM_64_LPAE_S2, and
> > ARM_MALI_LPAE formats.
> > 
> > Cc: Boris Brezillon <boris.brezillon at collabora.com>
> > Cc: Steven Price <steven.price at arm.com>
> > Cc: Liviu Dudau <liviu.dudau at arm.com>
> > Cc: dri-devel at lists.freedesktop.org
> > Signed-off-by: Jason Gunthorpe <jgg at nvidia.com>
> > ---
> >  drivers/iommu/io-pgtable-arm.c | 72 +++-------------------------------
> >  1 file changed, 6 insertions(+), 66 deletions(-)
> > 
> > I don't know anything in the iommu space that needs this, and this is the only
> > page table implementation in iommu that does it.
> 
> I think the v7s code does it as well, so please can you apply the same
> treatment to arm_v7s_split_blk_unmap()?

I have that patch written. I'm not as confident in it, as it is much
more complex, but it passes my simple tests.

However, if we make it fail and WARN_ON that should simplify it a lot.

> > @@ -678,12 +618,12 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
> >  
> >  		return i * size;
> >  	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
> > -		/*
> > -		 * Insert a table at the next level to map the old region,
> > -		 * minus the part we want to unmap
> > -		 */
> > -		return arm_lpae_split_blk_unmap(data, gather, iova, size, pte,
> > -						lvl + 1, ptep, pgcount);
> > +		/* Unmap the entire large IOPTE and return its size */
> > +		size = ARM_LPAE_BLOCK_SIZE(lvl, data);
> 
> If I understand your other message correctly, we shouldn't actually get
> into this situation any more, right? In which case, can we WARN_ONCE()
> and return 0 instead? Over-unmapping is filthy!

VFIO won't do it (except on AMD), but I have not tried to figure out if
something else might depend on the over-unmapping.

So, OK, let's try the WARN_ON and it is very easy to put the above
hunk back as a fixup if someone hits it.
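
To make the three options concrete, here is a minimal userspace sketch of
what each one reports for an unmap that lands inside a single large IOPTE.
This is not kernel code and not the actual patch: the enum, the SZ_*
values as defined here and model_unmap_leaf() are made up purely for
illustration, and the fprintf() only stands in for a kernel warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SZ_4K 0x1000UL
#define SZ_2M 0x200000UL

/* Hypothetical policy names, used only in this sketch. */
enum unmap_policy {
	POLICY_SPLIT,		/* old arm_lpae: split the block, unmap only the request */
	POLICY_OVER_UNMAP,	/* other drivers: drop the whole block, report its size */
	POLICY_WARN_FAIL,	/* suggested in review: warn and report 0 */
};

/*
 * Model an unmap of `size` bytes that hits one leaf IOPTE covering
 * `block_size` bytes. Returns the number of bytes reported as unmapped.
 */
static size_t model_unmap_leaf(enum unmap_policy policy, size_t block_size,
			       size_t size)
{
	/* This model only covers requests that fit inside the one IOPTE. */
	assert(size <= block_size);

	/* An exact-size unmap is uncontroversial: all policies agree. */
	if (size == block_size)
		return block_size;

	switch (policy) {
	case POLICY_SPLIT:
		/* Block replaced by a table of smaller IOPTEs; only `size` goes away. */
		return size;
	case POLICY_OVER_UNMAP:
		/* Whole block removed, caller is told the larger size. */
		return block_size;
	case POLICY_WARN_FAIL:
		/* Nothing is unmapped; stand-in for a kernel warning. */
		fprintf(stderr, "WARN: partial unmap of a %zu byte IOPTE\n",
			block_size);
		return 0;
	}
	return 0;
}
```

With a 2M IOPTE and a 4K request, the split policy reports 4K, the
over-unmap policy reports 2M, and the warn-and-fail policy reports 0 and
leaves the mapping in place.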

Jason


