[PATCH v3] iommu: arm-smmu: Set SMTNMB_TLBEN in ACR to enable caching of bypass entries
Robin Murphy
robin.murphy at arm.com
Fri Nov 4 07:19:53 PDT 2016
On 04/11/16 09:55, Nipun Gupta wrote:
> Setting the SMTNMB_TLBEN bit in the Auxiliary Configuration Register (ACR)
> allows the TLB to be updated for transactions that bypass the SMMU because
> no entry in the stream match table matches them. This reduces the latency
> of subsequent transactions with the same stream ID that bypass the SMMU,
> which provides a significant performance benefit for certain networking
> workloads.
>
> With this change, a performance improvement of ~9% is observed with the
> DPDK l3fwd application (http://dpdk.org/doc/guides/sample_app_ug/l3_forward.html)
> on NXP's LS2088a platform.
Reviewed-by: Robin Murphy <robin.murphy at arm.com>
> Signed-off-by: Nipun Gupta <nipun.gupta at nxp.com>
> ---
> Changes for v2:
>  - Incorporated Robin's comments on v1 related to:
>    - setting SMTNMB_TLBEN in ACR only for MMU-500, as ACR is
>      implementation dependent
>    - code comments and naming convention
> Changes for v3:
>  - Added correct patch version
>
> drivers/iommu/arm-smmu.c | 25 ++++++++++++++++---------
> 1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index ce2a9d4..05901be 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -247,6 +247,7 @@ enum arm_smmu_s2cr_privcfg {
>  #define ARM_MMU500_ACTLR_CPRE		(1 << 1)
>
>  #define ARM_MMU500_ACR_CACHE_LOCK	(1 << 26)
> +#define ARM_MMU500_ACR_SMTNMB_TLBEN	(1 << 8)
>
>  #define CB_PAR_F			(1 << 0)
>
> @@ -1569,16 +1570,22 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
>  	for (i = 0; i < smmu->num_mapping_groups; ++i)
>  		arm_smmu_write_sme(smmu, i);
>
> -	/*
> -	 * Before clearing ARM_MMU500_ACTLR_CPRE, need to
> -	 * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
> -	 * bit is only present in MMU-500r2 onwards.
> -	 */
> -	reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID7);
> -	major = (reg >> ID7_MAJOR_SHIFT) & ID7_MAJOR_MASK;
> -	if ((smmu->model == ARM_MMU500) && (major >= 2)) {
> +	if (smmu->model == ARM_MMU500) {
> +		/*
> +		 * Before clearing ARM_MMU500_ACTLR_CPRE, need to
> +		 * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
> +		 * bit is only present in MMU-500r2 onwards.
> +		 */
> +		reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID7);
> +		major = (reg >> ID7_MAJOR_SHIFT) & ID7_MAJOR_MASK;
>  		reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_sACR);
> -		reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
> +		if (major >= 2)
> +			reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
> +		/*
> +		 * Allow unmatched Stream IDs to allocate bypass
> +		 * TLB entries for reduced latency.
> +		 */
> +		reg |= ARM_MMU500_ACR_SMTNMB_TLBEN;
>  		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_sACR);
>  	}
>
>
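For anyone following along, the read-modify-write that the second hunk
performs on sACR can be modelled in plain user-space C. This is only a
sketch: the two bit positions are taken from the patch, but the helper
name mmu500_acr_update() and the shortened macro names are hypothetical,
and the MMIO accessors (readl_relaxed()/writel_relaxed()) are replaced by
an ordinary value passed in and returned.

```c
#include <stdint.h>

/* Bit positions as defined in the patch (names shortened here). */
#define ACR_CACHE_LOCK   (1u << 26)
#define ACR_SMTNMB_TLBEN (1u << 8)

/*
 * Hypothetical helper, not part of the driver: given the current sACR
 * value and the major revision read from ID7, return the value the
 * patched reset path would write back on an MMU-500.
 */
static uint32_t mmu500_acr_update(uint32_t acr, unsigned int major)
{
	/* CACHE_LOCK only exists from r2 onwards, so only clear it there. */
	if (major >= 2)
		acr &= ~ACR_CACHE_LOCK;

	/* Set on every MMU-500 revision, independent of the r2 check. */
	acr |= ACR_SMTNMB_TLBEN;

	return acr;
}
```

With CACHE_LOCK set on entry, mmu500_acr_update(1u << 26, 2) gives 0x100,
whereas with major == 1 the same input gives 0x04000100: bit 26 is left
untouched but SMTNMB_TLBEN is still set, which is the behavioural change
from moving the revision check inside the model check.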