[PATCH v7 4/6] arm64: cpufeature: Modify address authentication cpufeature to exact

Catalin Marinas catalin.marinas at arm.com
Fri Sep 11 13:24:19 EDT 2020


On Wed, Sep 09, 2020 at 07:37:57PM +0530, Amit Daniel Kachhap wrote:
> The current address authentication cpufeature levels are set as LOWER_SAFE
> which is not compatible with the different configurations added for Armv8.3
> ptrauth enhancements as the different levels have different behaviour and
> there is no tunable to enable the lower safe versions. This is rectified
> by setting those cpufeature type as EXACT.
> 
> The current cpufeature framework also does not interfere in the booting of
> non-exact secondary cpus but rather marks them as tainted. As a workaround
> this is fixed by replacing the generic match handler with a new handler
> specific to ptrauth.
> 
> After this change, if there is any variation in ptrauth configurations in
> secondary cpus from boot cpu then those mismatched cpus are parked in an
> infinite loop.
> 
> Following ptrauth crash log is oberserved in Arm fastmodel with mismatched
> cpus without this fix,
> 
>  CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64ISAR1_EL1. Boot CPU: 0x11111110211402, CPU4: 0x11111110211102
>  CPU features: Unsupported CPU feature variation detected.
>  GICv3: CPU4: found redistributor 100 region 0:0x000000002f180000
>  CPU4: Booted secondary processor 0x0000000100 [0x410fd0f0]
>  Unable to handle kernel paging request at virtual address bfff800010dadf3c
>  Mem abort info:
>    ESR = 0x86000004
>    EC = 0x21: IABT (current EL), IL = 32 bits
>    SET = 0, FnV = 0
>    EA = 0, S1PTW = 0
>  [bfff800010dadf3c] address between user and kernel address ranges
>  Internal error: Oops: 86000004 [#1] PREEMPT SMP
>  Modules linked in:
>  CPU: 4 PID: 29 Comm: migration/4 Tainted: G S                5.8.0-rc4-00005-ge658591d66d1-dirty #158
>  Hardware name: Foundation-v8A (DT)
>  pstate: 60000089 (nZCv daIf -PAN -UAO BTYPE=--)
>  pc : 0xbfff800010dadf3c
>  lr : __schedule+0x2b4/0x5a8
>  sp : ffff800012043d70
>  x29: ffff800012043d70 x28: 0080000000000000
>  x27: ffff800011cbe000 x26: ffff00087ad37580
>  x25: ffff00087ad37000 x24: ffff800010de7d50
>  x23: ffff800011674018 x22: 0784800010dae2a8
>  x21: ffff00087ad37000 x20: ffff00087acb8000
>  x19: ffff00087f742100 x18: 0000000000000030
>  x17: 0000000000000000 x16: 0000000000000000
>  x15: ffff800011ac1000 x14: 00000000000001bd
>  x13: 0000000000000000 x12: 0000000000000000
>  x11: 0000000000000000 x10: 71519a147ddfeb82
>  x9 : 825d5ec0fb246314 x8 : ffff00087ad37dd8
>  x7 : 0000000000000000 x6 : 00000000fffedb0e
>  x5 : 00000000ffffffff x4 : 0000000000000000
>  x3 : 0000000000000028 x2 : ffff80086e11e000
>  x1 : ffff00087ad37000 x0 : ffff00087acdc600
>  Call trace:
>   0xbfff800010dadf3c
>   schedule+0x78/0x110
>   schedule_preempt_disabled+0x24/0x40
>   __kthread_parkme+0x68/0xd0
>   kthread+0x138/0x160
>   ret_from_fork+0x10/0x34
>  Code: bad PC value

That's what FTR_EXACT gives us. The variation above is in the field at
bit position 8 (API_SHIFT) with the boot CPU value of 4 and the
secondary CPU of 1, if I read it correctly.

Would it be better if the incompatible CPUs are just parked? I'm trying
to figure out from the verify_local_cpu_caps() code whether that's
possible. I don't fully understand why we don't trigger the "Detected
conflict for capability" message instead.

My reading of the code is that we set the system capability based on the
boot CPU since it's a BOOT_CPU_FEATURE. has_address_auth_cpucap() should
return false with this mismatch and verify_local_cpu_caps() would park
the CPU. Once parked, it shouldn't reach the sanity check since
cpuinfo_store_cpu() is called after check_local_cpu_capabilities() (and
the match function).

> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 6424584be01e..4bb3f2b2ffed 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -197,9 +197,9 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
>  	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_FCMA_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_JSCVT_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
> -		       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_API_SHIFT, 4, 0),
> +		       FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_API_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
> -		       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_APA_SHIFT, 4, 0),
> +		       FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_APA_SHIFT, 4, 0),

So we go for FTR_EXACT here with a safe value of 0. It makes sense.
IIUC, in case of a mismatch, we end up with the sanitised register field
as 0.

>  	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_DPB_SHIFT, 4, 0),
>  	ARM64_FTR_END,
>  };
> @@ -1648,11 +1648,39 @@ static void cpu_clear_disr(const struct arm64_cpu_capabilities *__unused)
>  #endif /* CONFIG_ARM64_RAS_EXTN */
>  
>  #ifdef CONFIG_ARM64_PTR_AUTH
> -static bool has_address_auth(const struct arm64_cpu_capabilities *entry,
> -			     int __unused)
> +static bool has_address_auth_cpucap(const struct arm64_cpu_capabilities *entry, int scope)
>  {
> -	return __system_matches_cap(ARM64_HAS_ADDRESS_AUTH_ARCH) ||
> -	       __system_matches_cap(ARM64_HAS_ADDRESS_AUTH_IMP_DEF);
> +	int boot_val, sec_val;
> +
> +	/* We don't expect to be called with SCOPE_SYSTEM */
> +	WARN_ON(scope == SCOPE_SYSTEM);
> +	/*
> +	 * The ptr-auth feature levels are not intercompatible with lower
> +	 * levels. Hence we must match ptr-auth feature level of the secondary
> +	 * CPUs with that of the boot CPU. The level of boot cpu is fetched
> +	 * from the sanitised register whereas direct register read is done for
> +	 * the secondary CPUs.
> +	 * The sanitised feature state is guaranteed to match that of the
> +	 * boot CPU as a mismatched secondary CPU is parked before it gets
> +	 * a chance to update the state, with the capability.
> +	 */
> +	boot_val = cpuid_feature_extract_field(read_sanitised_ftr_reg(entry->sys_reg),
> +					       entry->field_pos, entry->sign);
> +	if (scope & SCOPE_BOOT_CPU) {
> +		return boot_val >= entry->min_field_value;
> +	} else if (scope & SCOPE_LOCAL_CPU) {

Nitpick: just drop the else as we had a return already above. We could
also drop the SCOPE_LOCAL_CPU check since that's the only one left.

> +		sec_val = cpuid_feature_extract_field(__read_sysreg_by_encoding(entry->sys_reg),
> +						      entry->field_pos, entry->sign);
> +		return (sec_val >= entry->min_field_value) && (sec_val == boot_val);

Another nitpick: do you still need the sec_val >= ... if you check for
sec_val == boot_val? boot_val was already checked against
min_field_value. That's unless the sanitised reg is updated and boot_val
above is 0 on subsequent CPUs. This could only happen if we don't park a
CPU that has a mismatch and it ends up updating the sysreg. AFAICT, for
a secondary CPU, we first check the features then we update the
sanitised regs.

-- 
Catalin



More information about the linux-arm-kernel mailing list