[PATCH v2 1/3] iommu/io-pgtable-arm: Support coherency for Mali LPAE

Boris Brezillon boris.brezillon at collabora.com
Mon Oct 5 11:52:47 EDT 2020


On Mon, 5 Oct 2020 16:16:32 +0100
Steven Price <steven.price at arm.com> wrote:

> On 05/10/2020 15:50, Boris Brezillon wrote:
> > On Tue, 22 Sep 2020 15:16:48 +0100
> > Robin Murphy <robin.murphy at arm.com> wrote:
> >   
> >> Midgard GPUs have ACE-Lite master interfaces which allows systems to
> >> integrate them in an I/O-coherent manner. It seems that from the GPU's
> >> viewpoint, the rest of the system is its outer shareable domain, and so
> >> even when snoop signals are wired up, they are only emitted for outer
> >> shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
> >> indeed get coherent pagetable walks working nicely for the coherent
> >> T620 in the Arm Juno SoC.
> >>
> >> Reviewed-by: Steven Price <steven.price at arm.com>
> >> Tested-by: Neil Armstrong <narmstrong at baylibre.com>
> >> Signed-off-by: Robin Murphy <robin.murphy at arm.com>
> >> ---
> >>   drivers/iommu/io-pgtable-arm.c | 11 ++++++++++-
> >>   1 file changed, 10 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> >> index dc7bcf858b6d..b4072a18e45d 100644
> >> --- a/drivers/iommu/io-pgtable-arm.c
> >> +++ b/drivers/iommu/io-pgtable-arm.c
> >> @@ -440,7 +440,13 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> >>   				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
> >>   	}
> >>   
> >> -	if (prot & IOMMU_CACHE)
> >> +	/*
> >> +	 * Also Mali has its own notions of shareability wherein its Inner
> >> +	 * domain covers the cores within the GPU, and its Outer domain is
> >> +	 * "outside the GPU" (i.e. either the Inner or System domain in CPU
> >> +	 * terms, depending on coherency).
> >> +	 */
> >> +	if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE)
> >>   		pte |= ARM_LPAE_PTE_SH_IS;
> >>   	else
> >>   		pte |= ARM_LPAE_PTE_SH_OS;  
> > 
> > Actually, it still doesn't work on s922x :-/. For it to work
> > correctly, I need to drop the outer shareable flag here.  
> 
> The logic here does seem a bit odd. Originally it was:
> 
> IOMMU_CACHE -> Inner shared (value 3)
> !IOMMU_CACHE -> Outer shared (value 2)
> 
> For Mali we're forcing everything to the second option. But Mali being 
> Mali doesn't do things the same as LPAE, so for Mali we have:
> 
> 0 - not shared
> 1 - reserved
> 2 - inner(*) and outer shareable
> 3 - inner shareable only
> 
> (*) where "inner" means internal to the GPU, and "outer" means shared 
> with the CPU "inner". Very confusing!
> 
> So originally we had:
> IOMMU_CACHE -> not shared with CPU (only internally in the GPU)
> !IOMMU_CACHE -> shared with CPU
> 
> The change above gets us to "always shared", dropping the SH_OS bit 
> would get us to not even shareable between cores (which doesn't sound 
> like what we want).

Thanks for this explanation.
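
To keep the two interpretations straight, here is a minimal sketch of the
encodings described above. It is not kernel code: PROT_CACHE is a stand-in
for IOMMU_CACHE, the helper names are made up for illustration, and the
values are the bare 2-bit field values from the table (no PTE shift applied).

```c
#include <stdbool.h>

/* Stand-in for IOMMU_CACHE; the value here is arbitrary (assumption). */
#define PROT_CACHE 0x1

/* Raw 2-bit shareability field values, as listed in the table above. */
enum sh_field {
	SH_VAL_0 = 0,	/* Mali: not shared */
	SH_VAL_1 = 1,	/* Mali: reserved */
	SH_VAL_2 = 2,	/* LPAE: outer shareable; Mali: inner and outer */
	SH_VAL_3 = 3,	/* LPAE: inner shareable; Mali: inner (GPU) only */
};

/* Logic before the patch: IOMMU_CACHE selects 3, anything else selects 2. */
static inline enum sh_field sh_original(int prot)
{
	return (prot & PROT_CACHE) ? SH_VAL_3 : SH_VAL_2;
}

/* Logic after the patch: the Mali format always gets 2. */
static inline enum sh_field sh_patched(int prot, bool is_mali)
{
	return ((prot & PROT_CACHE) && !is_mali) ? SH_VAL_3 : SH_VAL_2;
}

/* In Mali terms, only value 2 makes the mapping shareable with the CPU. */
static inline bool mali_shared_with_cpu(enum sh_field sh)
{
	return sh == SH_VAL_2;
}
```

With these helpers, sh_original(PROT_CACHE) yields 3, which Mali treats as
GPU-internal only, while sh_patched(PROT_CACHE, true) yields 2, which is
shared with the CPU. That matches the "always shared" behaviour described
above for the patched code.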

> 
> It's not at all clear to me why the change helps, but I suspect we want 
> at least "inner" shareable.

Right. Looks like all this was caused by a bad conflict resolution
during a rebase. Sorry for the noise :-/.


