[Patch v1] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

Tue Apr 19 22:32:10 PDT 2022


On 4/20/2022 1:57 AM, Robin Murphy wrote:
> External email: Use caution opening links or attachments
> 
> 
> On 2022-04-17 10:04, Ashish Mhetre wrote:
>> Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
>> entries to not be invalidated correctly. The problem is that the walk
>> cache index generated for IOVA is not same across translation and
>> invalidation requests. This is leading to page faults when PMD entry is
>> released during unmap and populated with new PTE table during subsequent
>> map request. Disabling large page mappings avoids the release of PMD
>> entry and avoid translations seeing stale PMD entry in walk cache.
>> Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
>> Tegra234 devices. This is recommended fix from Tegra hardware design
>> team.
> 
> Is this related to any of the several known MMU-500 invalidation errata,
> or is it definitely specific to something NVIDIA have done with their
> integration?
> 
It's not a known MMU-500 errata. It is specific to NVIDIA.

>> Co-developed-by: Pritesh Raithatha <praithatha at nvidia.com>
>> Signed-off-by: Pritesh Raithatha <praithatha at nvidia.com>
>> Signed-off-by: Ashish Mhetre <amhetre at nvidia.com>
>> ---
>>   drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 23 ++++++++++++++++++++
>>   drivers/iommu/arm/arm-smmu/arm-smmu.c        |  3 +++
>>   drivers/iommu/arm/arm-smmu/arm-smmu.h        |  1 +
>>   3 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c 
>> b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
>> index 01e9b50b10a1..b7a3d06da2f4 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
>> @@ -258,6 +258,27 @@ static void nvidia_smmu_probe_finalize(struct 
>> arm_smmu_device *smmu, struct devi
>>                       dev_name(dev), err);
>>   }
>>
>> +static void nvidia_smmu_cfg_pgsize_bitmap(struct arm_smmu_device *smmu)
>> +{
>> +     const struct device_node *np = smmu->dev->of_node;
>> +
>> +     /*
>> +      * Tegra194 and Tegra234 SoCs have the erratum that causes walk 
>> cache
>> +      * entries to not be invalidated correctly. The problem is that 
>> the walk
>> +      * cache index generated for IOVA is not same across translation 
>> and
>> +      * invalidation requests. This is leading to page faults when 
>> PMD entry
>> +      * is released during unmap and populated with new PTE table during
>> +      * subsequent map request. Disabling large page mappings avoids the
>> +      * release of PMD entry and avoid translations seeing stale PMD 
>> entry in
>> +      * walk cache.
>> +      * Fix this by limiting the page mappings to PAGE_SIZE on 
>> Tegra194 and
>> +      * Tegra234.
>> +      */
>> +     if (of_device_is_compatible(np, "nvidia,tegra234-smmu") ||
>> +         of_device_is_compatible(np, "nvidia,tegra194-smmu"))
>> +             smmu->pgsize_bitmap = PAGE_SIZE;
>> +}
>> +
>>   static const struct arm_smmu_impl nvidia_smmu_impl = {
>>       .read_reg = nvidia_smmu_read_reg,
>>       .write_reg = nvidia_smmu_write_reg,
>> @@ -268,10 +289,12 @@ static const struct arm_smmu_impl 
>> nvidia_smmu_impl = {
>>       .global_fault = nvidia_smmu_global_fault,
>>       .context_fault = nvidia_smmu_context_fault,
>>       .probe_finalize = nvidia_smmu_probe_finalize,
>> +     .cfg_pgsize_bitmap = nvidia_smmu_cfg_pgsize_bitmap,
>>   };
>>
>>   static const struct arm_smmu_impl nvidia_smmu_single_impl = {
>>       .probe_finalize = nvidia_smmu_probe_finalize,
>> +     .cfg_pgsize_bitmap = nvidia_smmu_cfg_pgsize_bitmap,
>>   };
>>
>>   struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device 
>> *smmu)
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
>> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> index 568cce590ccc..3692a19a588a 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> @@ -1872,6 +1872,9 @@ static int arm_smmu_device_cfg_probe(struct 
>> arm_smmu_device *smmu)
>>       if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_64K)
>>               smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
>>
>> +     if (smmu->impl && smmu->impl->cfg_pgsize_bitmap)
>> +             smmu->impl->cfg_pgsize_bitmap(smmu);
> 
> I'm not the biggest fan of adding a super-specific hook for this, when
> it seems like it could just as easily be handled in the init_context
> hook, which is where it is precisely for the purpose of mangling the
> pgtable_cfg to influence io-pgtable's behaviour.
> 
Yes, we can use init_context() to override pgsize_bitmap. I'll update
that in next version.

> Thanks,
> Robin.
> 
>> +
>>       if (arm_smmu_ops.pgsize_bitmap == -1UL)
>>               arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
>>       else
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
>> b/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> index 2b9b42fb6f30..5d9b03024969 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> @@ -442,6 +442,7 @@ struct arm_smmu_impl {
>>       void (*write_s2cr)(struct arm_smmu_device *smmu, int idx);
>>       void (*write_sctlr)(struct arm_smmu_device *smmu, int idx, u32 
>> reg);
>>       void (*probe_finalize)(struct arm_smmu_device *smmu, struct 
>> device *dev);
>> +     void (*cfg_pgsize_bitmap)(struct arm_smmu_device *smmu);
>>   };
>>
>>   #define INVALID_SMENDX                      -1