[PATCH] arm64/sme: More sensibly define the size for the ZA register set
Naresh Kamboju
naresh.kamboju at linaro.org
Thu May 5 19:43:51 PDT 2022
Hi Mark
On Fri, 6 May 2022 at 03:45, Mark Brown <broonie at kernel.org> wrote:
>
> Since the vector length configuration mechanism is identical between SVE
> and SME we share large elements of the code including the definition for
> the maximum vector length. Unfortunately when we were defining the ABI
> for SVE we included not only the actual maximum vector length of 2048
> bits but also the value possible if all the bits reserved in the
> architecture for expansion of the LEN field were used, 16384 bits.
>
> This starts creating problems if we try to allocate anything for the ZA
> matrix based on the maximum possible vector length, as we do for the
> regset used with ptrace during the process of generating a core dump.
> While the maximum potential size for ZA with the current architecture is
> a reasonably managable 64K with the higher reserved limit ZA would be
> 64M which leads to entirely reasonable complaints from the memory
> management code when we try to allocate a buffer of that size. Avoid
> these issues by defining the actual maximum vector length for the
> architecture and using it for the SME regsets.
>
> Also use the full ZA_PT_SIZE() with the header rather than just the
> actual register payload when specifying the size, fixing support for the
> largest vector lengths now that we have this new, lower define. With the
> SVE maximum this did not cause problems due to the extra headroom we
> had.
>
> While we're at it add a comment clarifying why even though ZA is a
> single register we tell the regset code that it is a multi-register
> regset.
Thanks for providing the fix patch.
I have tested this patch and the reported warning got fixed.
>
> Reported-by: Qian Cai <quic_qiancai at quicinc.com>
> Signed-off-by: Mark Brown <broonie at kernel.org>
Tested-by: Naresh Kamboju <naresh.kamboju at linaro.org>
> ---
> arch/arm64/include/asm/fpsimd.h | 12 ++++++++++++
> arch/arm64/kernel/ptrace.c | 12 ++++++++++--
> 2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index 6fd879fe02da..aa11dbec0d70 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -32,6 +32,18 @@
> #define VFP_STATE_SIZE ((32 * 8) + 4)
> #endif
>
> +/*
> + * When we defined the maximum SVE vector length we defined the ABI so
> + * that the maximum vector length included all the reserved for future
> + * expansion bits in ZCR rather than those just currently defined by
> + * the architecture. While SME follows a similar pattern the fact that
> + * it includes a square matrix means that any allocations that attempt
> + * to cover the maximum potential vector length (such as happen with
> + * the regset used for ptrace) end up being extremely large. Define
> + * the much lower actual limit for use in such situations.
> + */
> +#define SME_VQ_MAX 16
> +
> struct task_struct;
>
> extern void fpsimd_save_state(struct user_fpsimd_state *state);
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 78085a32df83..140f5a8cc7c4 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -1438,7 +1438,7 @@ static const struct user_regset aarch64_regsets[] = {
> #ifdef CONFIG_ARM64_SME
> [REGSET_SSVE] = { /* Streaming mode SVE */
> .core_note_type = NT_ARM_SSVE,
> - .n = DIV_ROUND_UP(SVE_PT_SIZE(SVE_VQ_MAX, SVE_PT_REGS_SVE),
> + .n = DIV_ROUND_UP(SVE_PT_SIZE(SME_VQ_MAX, SVE_PT_REGS_SVE),
> SVE_VQ_BYTES),
> .size = SVE_VQ_BYTES,
> .align = SVE_VQ_BYTES,
> @@ -1447,7 +1447,15 @@ static const struct user_regset aarch64_regsets[] = {
> },
> [REGSET_ZA] = { /* SME ZA */
> .core_note_type = NT_ARM_ZA,
> - .n = DIV_ROUND_UP(ZA_PT_ZA_SIZE(SVE_VQ_MAX), SVE_VQ_BYTES),
> + /*
> + * ZA is a single register but it's variably sized and
> + * the ptrace core requires that the size of any data
> + * be an exact multiple of the configured register
> + * size so report as though we had SVE_VQ_BYTES
> + * registers. These values aren't exposed to
> + * userspace.
> + */
> + .n = DIV_ROUND_UP(ZA_PT_SIZE(SME_VQ_MAX), SVE_VQ_BYTES),
> .size = SVE_VQ_BYTES,
> .align = SVE_VQ_BYTES,
> .regset_get = za_get,
> --
> 2.30.2
>
Test log link,
https://lkft.validation.linaro.org/scheduler/job/4993362
- Naresh
More information about the linux-arm-kernel
mailing list