[RFC PATCH v3 09/12] iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages()

Wed Apr 7 10:57:06 BST 2021

On Tue, Apr 06, 2021 at 02:02:26PM -0700, isaacm at codeaurora.org wrote:
> On 2021-04-06 05:15, Will Deacon wrote:
> > On Mon, Apr 05, 2021 at 12:11:09PM -0700, Isaac J. Manjarres wrote:
> > > Implement the unmap_pages() callback for the ARM LPAE io-pgtable
> > > format.
> > > 
> > > Signed-off-by: Isaac J. Manjarres <isaacm at codeaurora.org>
> > > Suggested-by: Will Deacon <will at kernel.org>
> > > ---
> > >  drivers/iommu/io-pgtable-arm.c | 124
> > > +++++++++++++++++++++++++++------
> > >  1 file changed, 104 insertions(+), 20 deletions(-)
> > 
> > [...]
> > 
> > > +static size_t arm_lpae_unmap_pages(struct io_pgtable_ops *ops,
> > > unsigned long iova,
> > > +				   size_t pgsize, size_t pgcount,
> > > +				   struct iommu_iotlb_gather *gather)
> > > +{
> > > +	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> > > +	struct io_pgtable_cfg *cfg = &data->iop.cfg;
> > > +	arm_lpae_iopte *ptep = data->pgd;
> > > +	long iaext = (s64)iova >> cfg->ias;
> > > +	size_t unmapped = 0, unmapped_page;
> > > +	int last_lvl;
> > > +	size_t table_size, pages, tbl_offset, max_entries;
> > > +
> > > +	if (WARN_ON(!pgsize || (pgsize & cfg->pgsize_bitmap) != pgsize ||
> > > !pgcount))
> > > +		return 0;
> > > +
> > > +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
> > > +		iaext = ~iaext;
> > > +	if (WARN_ON(iaext))
> > > +		return 0;
> > > +
> > > +	/*
> > > +	 * Calculating the page table size here helps avoid situations where
> > > +	 * a page range that is being unmapped may be mapped at the same
> > > level
> > > +	 * but not mapped by the same tables. Allowing such a scenario to
> > > +	 * occur can complicate the logic in arm_lpae_split_blk_unmap().
> > > +	 */
> > > +	last_lvl = ARM_LPAE_BLOCK_SIZE_LVL(pgsize, data);
> > > +
> > > +	if (last_lvl == data->start_level)
> > > +		table_size = ARM_LPAE_PGD_SIZE(data);
> > > +	else
> > > +		table_size = ARM_LPAE_GRANULE(data);
> > > +
> > > +	max_entries = table_size / sizeof(*ptep);
> > > +
> > > +	while (pgcount) {
> > > +		tbl_offset = ARM_LPAE_LVL_IDX(iova, last_lvl, data);
> > > +		pages = min_t(size_t, pgcount, max_entries - tbl_offset);
> > > +		unmapped_page = __arm_lpae_unmap(data, gather, iova, pgsize,
> > > +						 pages, data->start_level,
> > > +						 ptep);
> > > +		if (!unmapped_page)
> > > +			break;
> > > +
> > > +		unmapped += unmapped_page;
> > > +		iova += unmapped_page;
> > > +		pgcount -= pages;
> > > +	}
> > 
> > Robin has some comments on the first version of this patch, and I
> > don't think you
> > addressed them:
> > 
> > https://lore.kernel.org/r/b93fa0b1-e2a4-1aad-8b88-4d0dfecdfef7@arm.com
> > 
> > I'm inclined to agree that iterating here doesn't make a lot of sense --
> > we've already come back out of the page-table walk, so I think we should
> > just return to the caller (who is well prepared to handle a partial
> > unmap).
> > Same for the map side of things.
> > 
> > If we get numbers showing that this is causing a performance issue, then
> > we should rework the page-table code to handle this at the lower level
> > (because I doubt the loop that you have is really worse than returning
> > to
> > the caller anyway).
> > 
> Sorry, I seem to have overlooked those comments.
> 
> I will go ahead and address them. I think it might be ideal to try to do
> as much work as possible in the io-pgtable level, so as to minimize the
> number
> of indirect calls incurred by jumping back and forth between iommu fwk,
> iommu
> driver, and io-pgtable code.
> 
> Perhaps we can try something like how the linear mappings are created on
> arm64 i.e.
> on the previous level, we can determine how many pages can be unmapped in
> one page table
> in one iteration, and on the subsequent iterations, we can tackle another
> page table at
> the lower level. Looking at the code, it doesn't seem too difficult to add
> this in. Thoughts?

I don't object to getting there eventually, but as an initial step I think
returning back to the caller is perfectly reasonable and will probably make
the patches easier to review. In other words, implement the simple (correct)
thing first, and then have subsequent patches to improve it.

Will