[PATCH v2 1/2] arm64: errata: Remove AES hwcap for COMPAT tasks

Will Deacon will at kernel.org
Wed Apr 20 03:17:56 PDT 2022


On Thu, Apr 14, 2022 at 06:43:32PM +0100, James Morse wrote:
> On 14/04/2022 11:03, Will Deacon wrote:
> > On Wed, Apr 13, 2022 at 06:05:44PM +0100, James Morse wrote:
> >> Cortex-A57 and Cortex-A72 have an erratum where an interrupt that
> >> occurs between a pair of AES instructions in aarch32 mode may corrupt
> >> the ELR. The task will subsequently produce the wrong AES result.
> >>
> >> The AES instructions are part of the cryptographic extensions, which are
> >> optional. User-space software will detect the support for these
> >> instructions from the hwcaps. If the platform doesn't support these
> >> instructions a software implementation should be used.
> >>
> >> Remove the hwcap bits on affected parts to indicate user-space should
> >> not use the AES instructions.
> 
> >> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >> index d72c4b4d389c..3faf16f1c040 100644
> >> --- a/arch/arm64/kernel/cpufeature.c
> >> +++ b/arch/arm64/kernel/cpufeature.c
> >> @@ -1922,6 +1922,12 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
> >>  }
> >>  #endif /* CONFIG_ARM64_MTE */
> >>  
> >> +static void elf_hwcap_fixup(void)
> >> +{
> >> +	if (cpus_have_const_cap(ARM64_WORKAROUND_1742098))
> >> +		compat_elf_hwcap2 &= ~COMPAT_HWCAP2_AES;
> >> +}
> > 
> > How does this deal with big/little if we late online an affected CPU?  It
> > would probably be easier if we treated these CPUs as not having the 32-bit
> > AES instructions at all (rather than removing the hwcap later), then the
> > early cap check would prevent late onlining.
> 
> I thought any new CPU to online late with a new erratum was rejected by the
> type == ARM64_CPUCAP_LOCAL_CPU_ERRATUM. Suzuki's documentation in cpufeature.h has:
> | However, it is not safe if a "late" CPU requires a workaround and the system hasn't
> | enabled it already.
> 
> In this case verify_local_cpu_caps() would take the else for 'system_has_cap', and because
> the cpu matches, but ARM64_CPUCAP_LOCAL_CPU_ERRATUM doesn't have the 'late cpu permitted'
> bit set, it should call cpu_die_early().
> 
> That said - I haven't tested this configuration. (I'll give it a go with the model)

Ah yes, that probably works, but please do test it to confirm.
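
To make the expected behaviour concrete, here is a minimal user-space model
of the late-CPU check described above. It is a sketch only, not the kernel
code: late_cpu_allowed(), the CAP_* flag names and their bit values are
hypothetical stand-ins, and the assumption is that an erratum-type
capability is "optional" for a late CPU but not "permitted" to be newly
introduced by one.

```c
/*
 * User-space model of the late-CPU capability check.
 * Not kernel code: every name below is a hypothetical stand-in.
 */
#include <stdbool.h>

/* Assumed flag layout, for illustration only. */
#define CAP_OPTIONAL_FOR_LATE_CPU	(1u << 0) /* late CPU may lack the cap */
#define CAP_PERMITTED_FOR_LATE_CPU	(1u << 1) /* late CPU may introduce the cap */

/* Erratum-style capability: optional for late CPUs, but not permitted. */
#define CAP_TYPE_LOCAL_CPU_ERRATUM	CAP_OPTIONAL_FOR_LATE_CPU

/* Returns true if the late CPU may come online, false if it is rejected. */
static bool late_cpu_allowed(unsigned int type, bool system_has_cap,
			     bool cpu_has_cap)
{
	if (!system_has_cap) {
		/* The system never enabled the workaround: a CPU needing it is unsafe. */
		if (cpu_has_cap && !(type & CAP_PERMITTED_FOR_LATE_CPU))
			return false;
	} else {
		/* The system relies on the cap: a CPU lacking it must be allowed to. */
		if (!cpu_has_cap && !(type & CAP_OPTIONAL_FOR_LATE_CPU))
			return false;
	}
	return true;
}
```

In this model, late_cpu_allowed(CAP_TYPE_LOCAL_CPU_ERRATUM, false, true)
returns false, which corresponds to the cpu_die_early() path for an affected
CPU onlined late on a system that has not enabled the workaround.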

> v1 did as you suggest - but the HWCAPs are built from the id registers, and touching the
> id registers will regress KVM guest migration as the id registers are both visible to
> QEMU, and invariant.

Hmm, is that really something we expect to work in general? It seems to me
that any erratum workaround which effectively removes functionality is going
to be a blocker for migration.
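
For reference, the two approaches being compared can be modelled in user
space as below. This is a sketch under stated assumptions, not the kernel
implementation: the helper names and MODEL_* constants are hypothetical,
and the AES field is assumed to sit in bits [7:4] of the AArch32 ISAR5
value, with 1 meaning AES and 2 meaning AES plus PMULL. The point is only
that masking the hwcap leaves the ID register value, which is what the VMM
sees, unchanged.

```c
/*
 * User-space model: deriving a compat hwcap from an ID register value,
 * then masking the hwcap for an erratum without touching the register.
 * All names and bit positions here are assumptions for illustration.
 */
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ISAR5_AES_SHIFT	4
#define MODEL_ISAR5_AES_MASK	0xfu

#define MODEL_HWCAP2_AES	(1u << 0)
#define MODEL_HWCAP2_PMULL	(1u << 1)

/* Build the hwcap bits from the (modelled) ID register AES field. */
static uint32_t model_hwcap2_from_isar5(uint32_t isar5)
{
	uint32_t aes = (isar5 >> MODEL_ISAR5_AES_SHIFT) & MODEL_ISAR5_AES_MASK;
	uint32_t hwcap2 = 0;

	if (aes >= 1)
		hwcap2 |= MODEL_HWCAP2_AES;
	if (aes >= 2)
		hwcap2 |= MODEL_HWCAP2_PMULL;
	return hwcap2;
}

/* v2-style fixup: clear only the AES hwcap; the ID register is not modified. */
static uint32_t model_hwcap2_fixup(uint32_t hwcap2, bool has_erratum)
{
	if (has_erratum)
		hwcap2 &= ~MODEL_HWCAP2_AES;
	return hwcap2;
}
```

With isar5 = 0x20 the derived hwcap has both bits set; after the fixup on an
affected part only MODEL_HWCAP2_PMULL remains, while isar5 itself is still
0x20.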

Will


