[PATCH 13/20] arm64/fpsimd: Make clone() compatible with ZA lazy saving
Mark Rutland
mark.rutland at arm.com
Wed May 7 08:22:06 PDT 2025
On Wed, May 07, 2025 at 03:58:01PM +0100, Will Deacon wrote:
> On Tue, May 06, 2025 at 04:25:16PM +0100, Mark Rutland wrote:
> > @@ -441,14 +449,39 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> > childregs->sp = stack_start;
> > }
> >
> > + /*
> > + * Due to the AAPCS64 "ZA lazy saving scheme", PSTATE.ZA and
> > + * TPIDR2 need to be manipulated as a pair, and either both
> > + * need to be inherited or both need to be reset.
> > + *
> > + * Within a process, child threads must not inherit their
> > + * parent's TPIDR2 value or they may clobber their parent's
> > + * stack at some later point.
> > + *
> > + * When a process is fork()'d, the child must inherit ZA and
> > + * TPIDR2 from its parent in case there was dormant ZA state.
> > + *
> > + * Use CLONE_VM to determine when the child will share the
> > + * address space with the parent, and cannot safely inherit the
> > + * state.
> > + */
> > + if (system_supports_sme()) {
> > + if (!(clone_flags & CLONE_VM)) {
> > + p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
>
> Why do we need to re-read this register given that we did this just a few
> lines earlier?
Sorry -- I had meant to delete the earlier read. My intent was to centralise
manipulation of TPIDR2 (and ZA) in this block so that it was clear that they
were manipulated as a pair.
I will delete the earlier read, and make this:
| if (system_supports_sme()) {
| if (!(clone_flags & CLONE_VM)) {
| p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
| ret = copy_thread_za(p, current);
| if (ret)
| return ret;
| } else {
| p->thread.tpidr2_el0 = 0;
| WARN_ON_ONCE(p->thread.svcr & SVCR_ZA_MASK);
| }
| }
... or I can clear TPIDR2 in arch_dup_task_struct() along with ZA, delete the
earlier read here, and make this:
| if (system_supports_sme() && !(clone_flags & CLONE_VM)) {
| p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
| ret = copy_thread_za(p, current);
| if (ret)
| return ret;
| }
Any preference?
[...]
> > diff --git a/tools/testing/selftests/arm64/abi/tpidr2.c b/tools/testing/selftests/arm64/abi/tpidr2.c
> Thank you for updating the selftest, but please can you put it in a
> separate patch?
Sure; I'll factor all the selftest changes out into separate patches.
Mark.
More information about the linux-arm-kernel
mailing list