[PATCH v3] iommu: arm-smmu: Set SMTNMB_TLBEN in ACR to enable caching of bypass entries

Robin Murphy robin.murphy at arm.com
Fri Nov 4 07:19:53 PDT 2016


On 04/11/16 09:55, Nipun Gupta wrote:
> The SMTNMB_TLBEN bit in the Auxiliary Configuration Register (ACR) enables
> TLB updates for transactions that bypass the SMMU because no entry in the
> stream match table matches them. This reduces the latency of subsequent
> transactions with the same stream ID that bypass the SMMU, which provides
> a significant performance benefit for certain networking workloads.
> 
> With this change, a performance improvement of ~9% is observed with the
> DPDK l3fwd application (http://dpdk.org/doc/guides/sample_app_ug/l3_forward.html)
> on NXP's LS2088a platform.

Reviewed-by: Robin Murphy <robin.murphy at arm.com>

> Signed-off-by: Nipun Gupta <nipun.gupta at nxp.com>
> ---
> Changes for v2:
>     - Incorporated Robin's comments on v1:
> 	- Set SMTNMB_TLBEN in ACR only for MMU-500, as ACR is implementation dependent
> 	- Code comments and naming convention
> Changes for v3:
>     - Added correct patch version
> 
>  drivers/iommu/arm-smmu.c | 25 ++++++++++++++++---------
>  1 file changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index ce2a9d4..05901be 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -247,6 +247,7 @@ enum arm_smmu_s2cr_privcfg {
>  #define ARM_MMU500_ACTLR_CPRE		(1 << 1)
>  
>  #define ARM_MMU500_ACR_CACHE_LOCK	(1 << 26)
> +#define ARM_MMU500_ACR_SMTNMB_TLBEN	(1 << 8)
>  
>  #define CB_PAR_F			(1 << 0)
>  
> @@ -1569,16 +1570,22 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
>  	for (i = 0; i < smmu->num_mapping_groups; ++i)
>  		arm_smmu_write_sme(smmu, i);
>  
> -	/*
> -	 * Before clearing ARM_MMU500_ACTLR_CPRE, need to
> -	 * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
> -	 * bit is only present in MMU-500r2 onwards.
> -	 */
> -	reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID7);
> -	major = (reg >> ID7_MAJOR_SHIFT) & ID7_MAJOR_MASK;
> -	if ((smmu->model == ARM_MMU500) && (major >= 2)) {
> +	if (smmu->model == ARM_MMU500) {
> +		/*
> +		 * Before clearing ARM_MMU500_ACTLR_CPRE, need to
> +		 * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
> +		 * bit is only present in MMU-500r2 onwards.
> +		 */
> +		reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID7);
> +		major = (reg >> ID7_MAJOR_SHIFT) & ID7_MAJOR_MASK;
>  		reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_sACR);
> -		reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
> +		if (major >= 2)
> +			reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
> +		/*
> +		 * Allow unmatched Stream IDs to allocate bypass
> +		 * TLB entries for reduced latency.
> +		 */
> +		reg |= ARM_MMU500_ACR_SMTNMB_TLBEN;
>  		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_sACR);
>  	}
>  
> 
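
For reference, the net effect of the hunk above on the sACR value can be
sketched as a pure function. This is only an illustration, not driver
code: mmu500_acr_reset_value() is a hypothetical helper name, the bit
positions are copied from the patch (written with an unsigned suffix
here), and the MMIO accessors are left out so it builds in userspace.

```c
#include <stdint.h>

/* Bit positions as defined in the patch above. */
#define ARM_MMU500_ACR_CACHE_LOCK	(1u << 26)
#define ARM_MMU500_ACR_SMTNMB_TLBEN	(1u << 8)

/*
 * Hypothetical helper (not part of arm-smmu.c): given the value read
 * from sACR and the major revision read from ID7, return the value the
 * MMU-500 branch of arm_smmu_device_reset() would write back.
 */
static uint32_t mmu500_acr_reset_value(uint32_t acr, unsigned int major)
{
	/*
	 * CACHE_LOCK is only present from MMU-500 r2 onwards; it has to
	 * be cleared before ARM_MMU500_ACTLR_CPRE can be cleared.
	 */
	if (major >= 2)
		acr &= ~ARM_MMU500_ACR_CACHE_LOCK;

	/* Allow unmatched Stream IDs to allocate bypass TLB entries. */
	acr |= ARM_MMU500_ACR_SMTNMB_TLBEN;

	return acr;
}
```

Note that before this patch sACR was only rewritten on r2 and later; with
it, every MMU-500 revision gets the read-modify-write, and only the
CACHE_LOCK clear stays conditional on the major revision.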



