[PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths
Alex Bennée
alex.bennee at linaro.org
Thu Sep 28 10:32:16 PDT 2017
Dave Martin <Dave.Martin at arm.com> writes:
> On Thu, Sep 14, 2017 at 10:45:07AM +0100, Alex Bennée wrote:
>>
>> Dave Martin <Dave.Martin at arm.com> writes:
>>
>> > This patch uses the cpufeatures framework to determine common SVE
>> > capabilities and vector lengths, and configures the runtime SVE
>> > support code appropriately.
>> >
>> > ZCR_ELx is not really a feature register, but it is convenient to
>> > use it as a template for recording the maximum vector length
>> > supported by a CPU, using the LEN field. This field is similar to
>> > a feature field in that it is a contiguous bitfield for which we
>> > want to determine the minimum system-wide value. This patch adds
>> > ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate
>> > custom code to populate it. Finding the minimum supported value of
>> > the LEN field is left to the cpufeatures framework in the usual
>> > way.
>> >
>> > The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,
>> > so for now we just require it to be zero.
>> >
>> > Note that much of this code is dormant and SVE still won't be used
>> > yet, since system_supports_sve() remains hardwired to false.
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin at arm.com>
>> > Cc: Alex Bennée <alex.bennee at linaro.org>
>> > Cc: Suzuki K Poulose <Suzuki.Poulose at arm.com>
>> >
>> > ---
>> >
>> > Changes since v1
>> > ----------------
>> >
>> > Requested by Alex Bennée:
>> >
>> > * Thin out BUG_ON()s:
>> > Redundant BUG_ON()s and ones that just check invariants are removed.
>> > Important sanity-checks are migrated to WARN_ON()s, with some
>> > minimal best-effort patch-up code.
>> >
>> > Other changes related to Alex Bennée's comments:
>> >
>> > * Migrate away from magic numbers for converting VL to VQ.
>> >
>> > Requested by Suzuki Poulose:
>> >
>> > * Make sve_vq_map __ro_after_init.
>> >
>> > Other changes related to Suzuki Poulose's comments:
>> >
>> > * Rely on cpufeatures for not attempting to update the vq map after boot.
>> > ---
>> > arch/arm64/include/asm/cpu.h | 4 ++
>> > arch/arm64/include/asm/cpufeature.h | 29 ++++++++++
>> > arch/arm64/include/asm/fpsimd.h | 10 ++++
>> > arch/arm64/kernel/cpufeature.c | 50 +++++++++++++++++
>> > arch/arm64/kernel/cpuinfo.c | 6 ++
>> > arch/arm64/kernel/fpsimd.c | 106 +++++++++++++++++++++++++++++++++++-
>> > 6 files changed, 202 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
>> > index 889226b..8839227 100644
>> > --- a/arch/arm64/include/asm/cpu.h
>> > +++ b/arch/arm64/include/asm/cpu.h
>> > @@ -41,6 +41,7 @@ struct cpuinfo_arm64 {
>> > u64 reg_id_aa64mmfr2;
>> > u64 reg_id_aa64pfr0;
>> > u64 reg_id_aa64pfr1;
>> > + u64 reg_id_aa64zfr0;
>> >
>> > u32 reg_id_dfr0;
>> > u32 reg_id_isar0;
>> > @@ -59,6 +60,9 @@ struct cpuinfo_arm64 {
>> > u32 reg_mvfr0;
>> > u32 reg_mvfr1;
>> > u32 reg_mvfr2;
>> > +
>> > + /* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */
>> > + u64 reg_zcr;
>> > };
>> >
>> > DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
>> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> > index 4ea3441..d98e7ba 100644
>> > --- a/arch/arm64/include/asm/cpufeature.h
>> > +++ b/arch/arm64/include/asm/cpufeature.h
>> > @@ -10,7 +10,9 @@
>> > #define __ASM_CPUFEATURE_H
>> >
>> > #include <asm/cpucaps.h>
>> > +#include <asm/fpsimd.h>
>> > #include <asm/hwcap.h>
>> > +#include <asm/sigcontext.h>
>> > #include <asm/sysreg.h>
>> >
>> > /*
>> > @@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)
>> > return val == ID_AA64PFR0_EL0_32BIT_64BIT;
>> > }
>> >
>> > +static inline bool id_aa64pfr0_sve(u64 pfr0)
>> > +{
>> > + u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT);
>> > +
>> > + return val > 0;
>> > +}
>> > +
>> > void __init setup_cpu_features(void);
>> >
>> > void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
>> > @@ -267,6 +276,26 @@ static inline bool system_supports_sve(void)
>> > return false;
>> > }
>> >
>> > +/*
>> > + * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
>> > + * vector length.
>> > + * Use only if SVE is present. This function clobbers the SVE vector length.
>> > + */
>>
>> :nit whitespace formatting.
>
> I'll add some newlines now to make this cleaner.
>
> /*
> * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
> * vector length.
> *
> * Use only if SVE is present.
> * This function clobbers the SVE vector length.
> */
>
> OK?
Yep.
>
>>
>> > +static u64 __maybe_unused read_zcr_features(void)
>> > +{
>> > + u64 zcr;
>> > + unsigned int vq_max;
>> > +
>> > + write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
>>
>> I'm confused, why are we writing something here? You mention clobbering
>> the SVE vector length but what was the point?
>
> Hmm, this deserves a comment -- coming back to this code, I had to think
> about it. Are the following extra comments sufficient explanation?
>
> /*
> * Set the maximum possible VL, and write zeroes to all other
> * bits to see if they stick.
> */
> write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
>
> zcr = read_sysreg_s(SYS_ZCR_EL1);
> zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* flag up sticky 1s outside LEN field */
> vq_max = sve_vq_from_vl(sve_get_vl());
> zcr |= vq_max - 1; /* set LEN field to maximum effective value */
>
>
> [...]
OK that makes more sense. Thanks.
>
>> Otherwise:
>>
>> Reviewed-by: Alex Bennée <alex.bennee at linaro.org>
>
> I'll wait on your responses to the above first.
Still good ;-)
>
> Cheers
> ---Dave
--
Alex Bennée
More information about the linux-arm-kernel
mailing list