[PATCH] arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
Marc Zyngier
maz at kernel.org
Fri Jan 3 09:58:18 PST 2025
On Fri, 03 Jan 2025 17:20:05 +0000,
Catalin Marinas <catalin.marinas at arm.com> wrote:
>
> On Fri, Jan 03, 2025 at 02:26:35PM +0000, Marc Zyngier wrote:
> > The hwcaps code that exposes SVE features to userspace only
> > considers ID_AA64ZFR0_EL1, while this is only valid when
> > ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> >
> > The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
> > ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> >
> > Things become a bit more interesting if the HW implements SME.
> > In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> > features. And these fields overlap with their SVE interpretations.
> > But the architecture says that the SME and SVE feature sets must
> > match, so we're still hunky-dory.
> >
> > This goes wrong if the HW implements SME, but not SVE. In this
> > case, we end up advertising some SVE features to userspace, even
> > if the HW has none. That's because we never consider whether SVE
> > is actually implemented. Oh well.
> >
> > Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> > being non-zero.
> >
> > Reported-by: Catalin Marinas <catalin.marinas at arm.com>
> > Signed-off-by: Marc Zyngier <maz at kernel.org>
> > Cc: Will Deacon <will at kernel.org>
> > Cc: Mark Rutland <mark.rutland at arm.com>
> > Cc: Mark Brown <broonie at kernel.org>
> > Cc: stable at vger.kernel.org
>
> I'd add:
>
> Fixes: 06a916feca2b ("arm64: Expose SVE2 features for userspace")
>
> While at the time the code was correct, the architecture messed up our
> assumptions with the introduction of SME.
Good point.
>
> > @@ -3022,6 +3027,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >  		.matches = match, \
> >  	}
> >
> > +#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap) \
> > +	{ \
> > +		__HWCAP_CAP(#cap, cap_type, cap) \
> > +		HWCAP_CPUID_MATCH(reg, field, min_value) \
> > +		.matches = match, \
> > +	}
>
> Do we actually need this macro?
It is either this macro, or a large-ish switch/case statement doing
the same thing. See below.
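For reference, the hwcaps table entries elsewhere in the patch (not quoted
here) then end up looking roughly like this -- indicative only, written
from memory rather than copied from the diff:

	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, SVEver, SVE2p1, CAP_HWCAP, KERNEL_HWCAP_SVE2P1),
	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, SVEver, SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2),
	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES),
	[...]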
>
> > +
> > #ifdef CONFIG_ARM64_PTR_AUTH
> > static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = {
> > {
> > @@ -3050,6 +3062,18 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
> > };
> > #endif
> >
> > +#ifdef CONFIG_ARM64_SVE
> > +static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
> > +{
> > +	u64 aa64pfr0 = __read_scoped_sysreg(SYS_ID_AA64PFR0_EL1, scope);
> > +
> > +	if (SYS_FIELD_GET(ID_AA64PFR0_EL1, SVE, aa64pfr0) < ID_AA64PFR0_EL1_SVE_IMP)
> > +		return false;
> > +
> > +	return has_user_cpuid_feature(cap, scope);
> > +}
> > +#endif
>
> We can name this has_sve_feature() and use it with the existing
> HWCAP_CAP_MATCH() macro. I think it would look identical.
I don't think that works. HWCAP_CAP_MATCH() doesn't take the
reg/field/limit information that we need to compute the
capability. Without such information neatly populated in
arm64_cpu_capabilities, you can't call has_user_cpuid_feature().
A previous incarnation of this patch did use that macro. But you then
end up having to map each cap to its field/limit and perform the check
"by hand". Roughly, it would look like this:
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d793ca08549cd..76566a8bcdd3c 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -3065,12 +3065,22 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
 #ifdef CONFIG_ARM64_SVE
 static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
 {
-	u64 aa64pfr0 = __read_scoped_sysreg(SYS_ID_AA64PFR0_EL1, scope);
+	u64 zfr0;
 
-	if (SYS_FIELD_GET(ID_AA64PFR0_EL1, SVE, aa64pfr0) < ID_AA64PFR0_EL1_SVE_IMP)
+	if (!system_supports_sve())
 		return false;
 
-	return has_user_cpuid_feature(cap, scope);
+	zfr0 = __read_scoped_sysreg(SYS_ID_AA64ZFR0_EL1, scope);
+
+	switch (cap->cap) {
+	case KERNEL_HWCAP_SVE2P1:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, SVEver, zfr0) >= ID_AA64ZFR0_EL1_SVEver_SVE2p1;
+	case KERNEL_HWCAP_SVE2:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, SVEver, zfr0) >= ID_AA64ZFR0_EL1_SVEver_SVE2;
+	case KERNEL_HWCAP_SVEAES:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, AES, zfr0) >= ID_AA64ZFR0_EL1_AES_IMP;
+	[...]
+	}
 }
 #endif
Frankly, I don't think this is worth it, and you still need to hack
read_scoped_sysreg().
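(The hack being, roughly, to split the helper so it can be fed a raw
register encoding instead of a capability entry -- an untested sketch,
not the exact diff:

static u64 __read_scoped_sysreg(u32 sys_reg, int scope)
{
	/* System-wide view: use the sanitised copy of the register */
	if (scope == SCOPE_SYSTEM)
		return read_sanitised_ftr_reg(sys_reg);

	/* Local-CPU view: read the register on this CPU */
	return __read_sysreg_by_encoding(sys_reg);
}

static u64 read_scoped_sysreg(const struct arm64_cpu_capabilities *entry, int scope)
{
	return __read_scoped_sysreg(entry->sys_reg, scope);
}
)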
>
> We might even be able to use system_supports_sve() directly and avoid
> changing read_scoped_sysreg(). setup_user_features() is called in
> smp_cpus_done() after setup_system_features(), so using
> system_supports_sve() directly should be fine here.
Yeah, that should work.
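I.e. something along these lines (untested), keeping the existing
HWCAP_CPUID_MATCH plumbing for the actual ID_AA64ZFR0_EL1 field check:

static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
{
	/* No SVE, no SVE hwcaps, irrespective of what ID_AA64ZFR0_EL1 says */
	if (!system_supports_sve())
		return false;

	return has_user_cpuid_feature(cap, scope);
}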
Thanks,
M.
--
Without deviation from the norm, progress is not possible.