[PATCH] arm64/simd: Avoid pointless clearing of FP/SIMD buffer
Eric Biggers
ebiggers at kernel.org
Mon Dec 8 15:24:22 PST 2025
On Sun, Dec 07, 2025 at 10:59:29AM +0100, Ard Biesheuvel wrote:
> On Sun, 7 Dec 2025 at 02:30, Eric Biggers <ebiggers at kernel.org> wrote:
> >
> > On Fri, Dec 05, 2025 at 09:13:46AM +0100, Ard Biesheuvel wrote:
> > > On Fri, 5 Dec 2025 at 07:50, Eric Biggers <ebiggers at kernel.org> wrote:
> > > >
> > > > On Thu, Dec 04, 2025 at 05:28:15PM +0100, Ard Biesheuvel wrote:
> > > > > The buffer provided to kernel_neon_begin() is only used if the task is
> > > > > scheduled out while the FP/SIMD is in use by the kernel, or when such a
> > > > > section is interrupted by a softirq that also uses the FP/SIMD.
> > > > >
> > > > > IOW, this happens rarely, and even if it happened often, there is still
> > > > > no reason for this buffer to be cleared beforehand, which happens by
> > > > > default when using a compiler that supports -ftrivial-auto-var-init.
> > > > >
> > > > > So mark the buffer as __uninitialized. Given that this is a variable
> > > > > attribute not a type attribute, this requires that the expression is
> > > > > tweaked a bit.
> > > > >
> > > > > Cc: Will Deacon <will at kernel.org>,
> > > > > Cc: Catalin Marinas <catalin.marinas at arm.com>,
> > > > > Cc: Kees Cook <keescook at chromium.org>
> > > > > Cc: Eric Biggers <ebiggers at kernel.org>
> > > > > Cc: Justin Stitt <justinstitt at google.com>
> > > > > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > > > > ---
> > > > > arch/arm64/include/asm/simd.h | 3 ++-
> > > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > The issue here is that returning a pointer to an automatic variable as
> > > > > it goes out of scope is slightly dodgy, especially in the context of
> > > > > __attribute__((cleanup())), on which the scoped guard API relies
> > > > > heavily. However, in this case it should be safe, given that this
> > > > > expression is the input to the guarded variable type's constructor.
> > > > >
> > > > > It is definitely not pretty, though, so hopefully here is a better way
> > > > > to attach this.
> > > > >
> > > > > diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
> > > > > index 0941f6f58a14..825b7fe94003 100644
> > > > > --- a/arch/arm64/include/asm/simd.h
> > > > > +++ b/arch/arm64/include/asm/simd.h
> > > > > @@ -48,6 +48,7 @@ DEFINE_LOCK_GUARD_1(ksimd,
> > > > > kernel_neon_begin(_T->lock),
> > > > > kernel_neon_end(_T->lock))
> > > > >
> > > > > -#define scoped_ksimd() scoped_guard(ksimd, &(struct user_fpsimd_state){})
> > > > > +#define scoped_ksimd() \
> > > > > + scoped_guard(ksimd, ({ struct user_fpsimd_state __uninitialized s; &s; }))
> > > >
> > > > Ick. I should have looked at the generated code more closely.
> > > >
> > > > It's actually worse than you describe, because the zeroing is there even
> > > > without CONFIG_INIT_STACK_ALL_ZERO=y, simply because the
> > > > user_fpsimd_state struct is declared using a compound literal.
> > > >
> > > > I'm afraid that this patch probably isn't a good idea, as it relies on
> > > > undefined behavior. Before this patch, the user_fpsimd_state is
> > > > declared using a compound literal, which takes on its enclosing scope,
> > > > i.e. the 'for' statement generated by scoped_guard(). After this patch,
> > > > it's in a new inner scope, and the pointer to it escapes from it.
> > > >
> > > > Unfortunately I don't think there's any way to solve this while keeping
> > > > the scoped_ksimd() API as-is.
> > > >
> > >
> > > How about
> > >
> > > --- a/arch/arm64/include/asm/simd.h
> > > +++ b/arch/arm64/include/asm/simd.h
> > > @@ -48,6 +48,8 @@ DEFINE_LOCK_GUARD_1(ksimd,
> > > kernel_neon_begin(_T->lock),
> > > kernel_neon_end(_T->lock))
> > >
> > > -#define scoped_ksimd() scoped_guard(ksimd, &(struct user_fpsimd_state){})
> > > +#define scoped_ksimd() __scoped_ksimd(__UNIQUE_ID(fpsimd_state))
> > > +#define __scoped_ksimd(id) struct user_fpsimd_state __uninitialized id; \
> > > + scoped_guard(ksimd, &id)
> >
> > I guess that will work. It's not great that it will make scoped_ksimd()
> > expand into more than one statement, which is error-prone and not
> > normally allowed in macros. But it looks okay for all the current users
> > of it.
> >
>
> We could always repeat the 'for()' trick that the cleanup helpers use, e.g.,
>
> for (struct user_fpsimd_state __uninitialized __st; true; ({ goto label; }))
> if (0) {
> label: break;
> } else scoped_guard(ksimd, &__st)
>
> Would you prefer that?
Hmm, I didn't consider using nested 'for' statements. It looks like
that should solve the problem. It makes the implementation a bit harder
to understand, but at least the 'for' statement trick isn't new...
Could you send a patch? Preferably with a clear comment that documents
why it's done this way.
- Eric
More information about the linux-arm-kernel
mailing list