[PATCH v3] arm64: mte: optimize GCR_EL1 modification on kernel entry/exit

Will Deacon will at kernel.org
Tue Jul 13 08:52:04 PDT 2021


On Thu, Jul 08, 2021 at 06:49:41PM -0700, Peter Collingbourne wrote:
> Accessing GCR_EL1 and issuing an ISB can be expensive on some
> microarchitectures. Although we must write to GCR_EL1, we can
> restructure the code to avoid reading from it because the new value
> can be derived entirely from the exclusion mask, which is already in
> a GPR. Do so.
> 
> Furthermore, although an ISB is required in order to make this system
> register update effective, and the same is true for PAC-related updates
> to SCTLR_EL1 or APIAKey{Hi,Lo}_EL1, we issue two ISBs on machines
> that support both features while we only need to issue one. To avoid
> the unnecessary additional ISB, remove the ISBs from the PAC and
> MTE-specific alternative blocks and add an ISB in a separate block
> that is activated only if either feature is supported.

Sorry to be a pain, but can you split this into two patches, please? I
think you're making two distinct changes, and it would be easier to review
and discuss them separately (it would also be interesting to know the
relative performance improvement you get from them).

> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index efed2830d141..740e09ade2ea 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1717,6 +1717,20 @@ static bool has_generic_auth(const struct arm64_cpu_capabilities *entry,
>  }
>  #endif /* CONFIG_ARM64_PTR_AUTH */
>  
> +static bool has_address_auth_or_mte(const struct arm64_cpu_capabilities *entry,
> +				    int scope)
> +{
> +#ifdef CONFIG_ARM64_PTR_AUTH
> +	if (has_address_auth_metacap(entry, scope))
> +		return true;
> +#endif
> +#ifdef CONFIG_ARM64_MTE
> +	if (__system_matches_cap(ARM64_MTE))
> +		return true;
> +#endif
> +	return false;
> +}
> +
>  #ifdef CONFIG_ARM64_E0PD
>  static void cpu_enable_e0pd(struct arm64_cpu_capabilities const *cap)
>  {
> @@ -2218,6 +2232,11 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.matches = has_cpuid_feature,
>  		.min_field_value = 1,
>  	},
> +	{
> +		.capability = ARM64_HAS_ADDRESS_AUTH_OR_MTE,
> +		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
> +		.matches = has_address_auth_or_mte,
> +	},

I'd rather avoid adding a new cap for this, as these features are entirely
unrelated in the architecture and if we end up piling more combinations of
features in here in the future then I fear it will become quite unwieldy.

Instead, how about we just use a conditional branch alongside the existing
capabilities? E.g.


	alternative_if ARM64_MTE
		isb
		b	1f
	alternative_else_nop_endif
	alternative_if ARM64_HAS_ADDRESS_AUTH
		isb
	alternative_else_nop_endif
1:

?

Failing that, maybe you could use alternative_cb to avoid the new
capability, but I'm not sure it's worth it.

Will



More information about the linux-arm-kernel mailing list