[PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths

Alex Bennée alex.bennee at linaro.org
Thu Sep 28 10:32:16 PDT 2017


Dave Martin <Dave.Martin at arm.com> writes:

> On Thu, Sep 14, 2017 at 10:45:07AM +0100, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin at arm.com> writes:
>>
>> > This patch uses the cpufeatures framework to determine common SVE
>> > capabilities and vector lengths, and configures the runtime SVE
>> > support code appropriately.
>> >
>> > ZCR_ELx is not really a feature register, but it is convenient to
>> > use it as a template for recording the maximum vector length
>> > supported by a CPU, using the LEN field.  This field is similar to
>> > a feature field in that it is a contiguous bitfield for which we
>> > want to determine the minimum system-wide value.  This patch adds
>> > ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate
>> > custom code to populate it.  Finding the minimum supported value of
>> > the LEN field is left to the cpufeatures framework in the usual
>> > way.
>> >
>> > The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,
>> > so for now we just require it to be zero.
>> >
>> > Note that much of this code is dormant and SVE still won't be used
>> > yet, since system_supports_sve() remains hardwired to false.
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin at arm.com>
>> > Cc: Alex Bennée <alex.bennee at linaro.org>
>> > Cc: Suzuki K Poulose <Suzuki.Poulose at arm.com>
>> >
>> > ---
>> >
>> > Changes since v1
>> > ----------------
>> >
>> > Requested by Alex Bennée:
>> >
>> > * Thin out BUG_ON()s:
>> > Redundant BUG_ON()s and ones that just check invariants are removed.
>> > Important sanity-checks are migrated to WARN_ON()s, with some
>> > minimal best-effort patch-up code.
>> >
>> > Other changes related to Alex Bennée's comments:
>> >
>> > * Migrate away from magic numbers for converting VL to VQ.
>> >
>> > Requested by Suzuki Poulose:
>> >
>> > * Make sve_vq_map __ro_after_init.
>> >
>> > Other changes related to Suzuki Poulose's comments:
>> >
>> > * Rely on cpufeatures for not attempting to update the vq map after boot.
>> > ---
>> >  arch/arm64/include/asm/cpu.h        |   4 ++
>> >  arch/arm64/include/asm/cpufeature.h |  29 ++++++++++
>> >  arch/arm64/include/asm/fpsimd.h     |  10 ++++
>> >  arch/arm64/kernel/cpufeature.c      |  50 +++++++++++++++++
>> >  arch/arm64/kernel/cpuinfo.c         |   6 ++
>> >  arch/arm64/kernel/fpsimd.c          | 106 +++++++++++++++++++++++++++++++++++-
>> >  6 files changed, 202 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
>> > index 889226b..8839227 100644
>> > --- a/arch/arm64/include/asm/cpu.h
>> > +++ b/arch/arm64/include/asm/cpu.h
>> > @@ -41,6 +41,7 @@ struct cpuinfo_arm64 {
>> >  	u64		reg_id_aa64mmfr2;
>> >  	u64		reg_id_aa64pfr0;
>> >  	u64		reg_id_aa64pfr1;
>> > +	u64		reg_id_aa64zfr0;
>> >
>> >  	u32		reg_id_dfr0;
>> >  	u32		reg_id_isar0;
>> > @@ -59,6 +60,9 @@ struct cpuinfo_arm64 {
>> >  	u32		reg_mvfr0;
>> >  	u32		reg_mvfr1;
>> >  	u32		reg_mvfr2;
>> > +
>> > +	/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */
>> > +	u64		reg_zcr;
>> >  };
>> >
>> >  DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
>> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> > index 4ea3441..d98e7ba 100644
>> > --- a/arch/arm64/include/asm/cpufeature.h
>> > +++ b/arch/arm64/include/asm/cpufeature.h
>> > @@ -10,7 +10,9 @@
>> >  #define __ASM_CPUFEATURE_H
>> >
>> >  #include <asm/cpucaps.h>
>> > +#include <asm/fpsimd.h>
>> >  #include <asm/hwcap.h>
>> > +#include <asm/sigcontext.h>
>> >  #include <asm/sysreg.h>
>> >
>> >  /*
>> > @@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)
>> >  	return val == ID_AA64PFR0_EL0_32BIT_64BIT;
>> >  }
>> >
>> > +static inline bool id_aa64pfr0_sve(u64 pfr0)
>> > +{
>> > +	u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT);
>> > +
>> > +	return val > 0;
>> > +}
>> > +
>> >  void __init setup_cpu_features(void);
>> >
>> >  void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
>> > @@ -267,6 +276,26 @@ static inline bool system_supports_sve(void)
>> >  	return false;
>> >  }
>> >
>> > +/*
>> > + * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
>> > + * vector length.
>> > + * Use only if SVE is present.  This function clobbers the SVE vector length.
>> > + */
>>
>> :nit whitespace formatting.
>
> I'll add some newlines now to make this cleaner.
>
> /*
>  * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
>  * vector length.
>  *
>  * Use only if SVE is present.
>  * This function clobbers the SVE vector length.
>  */
>
> OK?

Yep.
>
>>
>> > +static u64 __maybe_unused read_zcr_features(void)
>> > +{
>> > +	u64 zcr;
>> > +	unsigned int vq_max;
>> > +
>> > +	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
>>
>> I'm confused, why are we writing something here? You mention clobbering
>> the SVE vector length but what was the point?
>
> Hmm, this deserves a comment -- coming back to this code, I had to think
> about it.  Are the following extra comments sufficient explanation?
>
> 	/*
> 	 * Set the maximum possible VL, and write zeroes to all other
> 	 * bits to see if they stick.
> 	 */
> 	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
>
> 	zcr = read_sysreg_s(SYS_ZCR_EL1);
> 	zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* flag up sticky 1s outside LEN field */
> 	vq_max = sve_vq_from_vl(sve_get_vl());
> 	zcr |= vq_max - 1; /* set LEN field to maximum effective value */
>
>
> [...]

OK that makes more sense. Thanks.
>
>> Otherwise:
>>
>> Reviewed-by: Alex Bennée <alex.bennee at linaro.org>
>
> I'll wait on your responses to the above first.

Still good ;-)

>
> Cheers
> ---Dave


--
Alex Bennée



More information about the linux-arm-kernel mailing list