[PATCH] arm64/sme: More sensibly define the size for the ZA register set

Naresh Kamboju naresh.kamboju at linaro.org
Thu May 5 19:43:51 PDT 2022


Hi Mark

On Fri, 6 May 2022 at 03:45, Mark Brown <broonie at kernel.org> wrote:
>
> Since the vector length configuration mechanism is identical between SVE
> and SME we share large elements of the code including the definition for
> the maximum vector length. Unfortunately when we were defining the ABI
> for SVE we included not only the actual maximum vector length of 2048
> bits but also the value possible if all the bits reserved in the
> architecture for expansion of the LEN field were used, 16384 bits.
>
> This starts creating problems if we try to allocate anything for the ZA
> matrix based on the maximum possible vector length, as we do for the
> regset used with ptrace during the process of generating a core dump.
> While the maximum potential size for ZA with the current architecture is
> a reasonably managable 64K with the higher reserved limit ZA would be
> 64M which leads to entirely reasonable complaints from the memory
> management code when we try to allocate a buffer of that size. Avoid
> these issues by defining the actual maximum vector length for the
> architecture and using it for the SME regsets.
>
> Also use the full ZA_PT_SIZE() with the header rather than just the
> actual register payload when specifying the size, fixing support for the
> largest vector lengths now that we have this new, lower define. With the
> SVE maximum this did not cause problems due to the extra headroom we
> had.
>
> While we're at it add a comment clarifying why even though ZA is a
> single register we tell the regset code that it is a multi-register
> regset.

Thanks for providing the fix patch.
I have tested this patch and the reported warning got fixed.

>
> Reported-by: Qian Cai <quic_qiancai at quicinc.com>
> Signed-off-by: Mark Brown <broonie at kernel.org>

Tested-by: Naresh Kamboju <naresh.kamboju at linaro.org>

> ---
>  arch/arm64/include/asm/fpsimd.h | 12 ++++++++++++
>  arch/arm64/kernel/ptrace.c      | 12 ++++++++++--
>  2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index 6fd879fe02da..aa11dbec0d70 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -32,6 +32,18 @@
>  #define VFP_STATE_SIZE         ((32 * 8) + 4)
>  #endif
>
> +/*
> + * When we defined the maximum SVE vector length we defined the ABI so
> + * that the maximum vector length included all the reserved for future
> + * expansion bits in ZCR rather than those just currently defined by
> + * the architecture. While SME follows a similar pattern the fact that
> + * it includes a square matrix means that any allocations that attempt
> + * to cover the maximum potential vector length (such as happen with
> + * the regset used for ptrace) end up being extremely large. Define
> + * the much lower actual limit for use in such situations.
> + */
> +#define SME_VQ_MAX     16
> +
>  struct task_struct;
>
>  extern void fpsimd_save_state(struct user_fpsimd_state *state);
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 78085a32df83..140f5a8cc7c4 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -1438,7 +1438,7 @@ static const struct user_regset aarch64_regsets[] = {
>  #ifdef CONFIG_ARM64_SME
>         [REGSET_SSVE] = { /* Streaming mode SVE */
>                 .core_note_type = NT_ARM_SSVE,
> -               .n = DIV_ROUND_UP(SVE_PT_SIZE(SVE_VQ_MAX, SVE_PT_REGS_SVE),
> +               .n = DIV_ROUND_UP(SVE_PT_SIZE(SME_VQ_MAX, SVE_PT_REGS_SVE),
>                                   SVE_VQ_BYTES),
>                 .size = SVE_VQ_BYTES,
>                 .align = SVE_VQ_BYTES,
> @@ -1447,7 +1447,15 @@ static const struct user_regset aarch64_regsets[] = {
>         },
>         [REGSET_ZA] = { /* SME ZA */
>                 .core_note_type = NT_ARM_ZA,
> -               .n = DIV_ROUND_UP(ZA_PT_ZA_SIZE(SVE_VQ_MAX), SVE_VQ_BYTES),
> +               /*
> +                * ZA is a single register but it's variably sized and
> +                * the ptrace core requires that the size of any data
> +                * be an exact multiple of the configured register
> +                * size so report as though we had SVE_VQ_BYTES
> +                * registers. These values aren't exposed to
> +                * userspace.
> +                */
> +               .n = DIV_ROUND_UP(ZA_PT_SIZE(SME_VQ_MAX), SVE_VQ_BYTES),
>                 .size = SVE_VQ_BYTES,
>                 .align = SVE_VQ_BYTES,
>                 .regset_get = za_get,
> --
> 2.30.2
>

Test log link,
https://lkft.validation.linaro.org/scheduler/job/4993362

- Naresh



More information about the linux-arm-kernel mailing list