[PATCH v2 1/3] iommu/io-pgtable-arm: Support coherency for Mali LPAE

Steven Price steven.price at arm.com
Mon Oct 5 11:16:32 EDT 2020


On 05/10/2020 15:50, Boris Brezillon wrote:
> On Tue, 22 Sep 2020 15:16:48 +0100
> Robin Murphy <robin.murphy at arm.com> wrote:
> 
>> Midgard GPUs have ACE-Lite master interfaces which allows systems to
>> integrate them in an I/O-coherent manner. It seems that from the GPU's
>> viewpoint, the rest of the system is its outer shareable domain, and so
>> even when snoop signals are wired up, they are only emitted for outer
>> shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
>> indeed get coherent pagetable walks working nicely for the coherent
>> T620 in the Arm Juno SoC.
>>
>> Reviewed-by: Steven Price <steven.price at arm.com>
>> Tested-by: Neil Armstrong <narmstrong at baylibre.com>
>> Signed-off-by: Robin Murphy <robin.murphy at arm.com>
>> ---
>>   drivers/iommu/io-pgtable-arm.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index dc7bcf858b6d..b4072a18e45d 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -440,7 +440,13 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>>   				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>>   	}
>>   
>> -	if (prot & IOMMU_CACHE)
>> +	/*
>> +	 * Also Mali has its own notions of shareability wherein its Inner
>> +	 * domain covers the cores within the GPU, and its Outer domain is
>> +	 * "outside the GPU" (i.e. either the Inner or System domain in CPU
>> +	 * terms, depending on coherency).
>> +	 */
>> +	if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE)
>>   		pte |= ARM_LPAE_PTE_SH_IS;
>>   	else
>>   		pte |= ARM_LPAE_PTE_SH_OS;
> 
> Actually, it still doesn't work on s922x :-/. For it to work
> correctly, I need to drop the outer shareable flag here.

The logic here does seem a bit odd. Originally it was:

IOMMU_CACHE -> Inner shared (value 3)
!IOMMU_CACHE -> Outer shared (value 2)

With this patch, Mali is forced to the second option (value 2) whether
or not IOMMU_CACHE is set. But Mali, being Mali, doesn't interpret the
field the same way as LPAE, so for Mali the values mean:

0 - not shared
1 - reserved
2 - inner(*) and outer shareable
3 - inner shareable only

(*) where "inner" means internal to the GPU, and "outer" means shared 
with the CPU "inner". Very confusing!

So originally, in Mali terms, we had:
IOMMU_CACHE -> not shared with CPU (only internally in the GPU)
!IOMMU_CACHE -> shared with CPU

The change above gets us to "always shared". Dropping the SH_OS bit
(leaving the field at 0) would get us to not even shareable between GPU
cores, which doesn't sound like what we want.
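
To make the cases above concrete, here is a small standalone C sketch. It
is not kernel code: the enum and function names are made up for
illustration, and only the numeric field values and their Mali meanings
are taken from the lists above.

```c
#include <stdbool.h>

/*
 * Raw SH field values from the list above. The names are illustrative,
 * not the kernel's macros.
 */
enum sh_field {
	SH_VAL_0 = 0,
	SH_VAL_1 = 1,
	SH_VAL_2 = 2,	/* what the LPAE code calls "outer shareable" */
	SH_VAL_3 = 3,	/* what the LPAE code calls "inner shareable" */
};

/* Original logic: the choice depends only on IOMMU_CACHE. */
static enum sh_field sh_original(bool iommu_cache)
{
	return iommu_cache ? SH_VAL_3 : SH_VAL_2;
}

/* Patched logic: the Mali format always takes the else branch. */
static enum sh_field sh_patched(bool iommu_cache, bool is_mali)
{
	return (iommu_cache && !is_mali) ? SH_VAL_3 : SH_VAL_2;
}

/* How Mali interprets the field, per the list above. */
static const char *mali_meaning(enum sh_field sh)
{
	switch (sh) {
	case SH_VAL_0: return "not shared";
	case SH_VAL_1: return "reserved";
	case SH_VAL_2: return "inner (GPU-internal) and outer (CPU) shareable";
	case SH_VAL_3: return "inner (GPU-internal) shareable only";
	}
	return "invalid";
}
```

With these helpers, mali_meaning(sh_patched(cache, true)) gives the
value-2 description for either setting of cache, while a field left at 0
(no SH bits set) maps to "not shared".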

It's not at all clear to me why dropping the flag helps, but I suspect
we want at least "inner" shareable.

Steve

>> @@ -1049,6 +1055,9 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
>>   	cfg->arm_mali_lpae_cfg.transtab = virt_to_phys(data->pgd) |
>>   					  ARM_MALI_LPAE_TTBR_READ_INNER |
>>   					  ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
>> +	if (cfg->coherent_walk)
>> +		cfg->arm_mali_lpae_cfg.transtab |= ARM_MALI_LPAE_TTBR_SHARE_OUTER;
>> +
>>   	return &data->iop;
>>   
>>   out_free_data:
> 



