[PATCH v3 2/3] arm64: Optimize __READ_ONCE() with CONFIG_LTO=y
Marco Elver
elver at google.com
Mon Feb 2 09:48:47 PST 2026
On Mon, Feb 02, 2026 at 03:36:40PM +0000, Will Deacon wrote:
> I know that CONFIG_LTO practically depends on Clang, but it's a bit
> grotty relying on that assumption here. Ideally, it would be
> straightforward to enable the strong READ_ONCE() semantics on arm64
> regardless of the compiler.
Does it matter for GCC versions that do not support LTO? Because I'm
quite sure that if, one day, we add support for GCC LTO, that GCC
version will be new enough that it'll just take the __typeof_unqual__()
version and it'll "just work".
The problem with older GCC versions was that their __auto_type did not
actually strip qualifiers (which it should have) -- this was fixed at
some point.
On Mon, Feb 02, 2026 at 04:05PM +0000, Will Deacon wrote:
> On Mon, Feb 02, 2026 at 05:01:39PM +0100, Peter Zijlstra wrote:
> > On Mon, Feb 02, 2026 at 03:36:40PM +0000, Will Deacon wrote:
> >
> > > Since we're not providing acquire semantics for the non-atomic case,
> > > what we really want is the generic definition of __READ_ONCE() from
> > > include/asm-generic/rwonce.h here. The header inclusion mess prevents
> > > that, but why can't we just inline that definition here for the
> > > 'default' case? If TYPEOF_UNQUAL() leads to better codegen, shouldn't
> > > we use that to implement __unqual_scalar_typeof() when it is available?
> >
> > We are?
>
> Great! Then I don't grok why we need to choose between
> __unqual_scalar_typeof() and __typeof_unqual__() in the arch code. We
> should just use the former and it will DTRT.
The old __unqual_scalar_typeof() is still broken where
__typeof_unqual__() is unavailable - for the arm64 + LTO case that'd be
Clang <= 18, which we still have to support.
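To illustrate the limitation, here is a minimal sketch of a _Generic-based qualifier stripper. It is an assumption-laden stand-in, not the kernel's definition: the name sketch_unqual_scalar_typeof(), the reduced type list, struct fown_sketch and the helper functions are all hypothetical. The point it tries to show is that listed scalar types lose their qualifiers, while anything that falls through to the default branch (such as a const pointer) keeps them.

```c
#include <assert.h>

/*
 * Hypothetical, reduced stand-in for a _Generic-based qualifier stripper.
 * _Generic applies lvalue conversion to its controlling expression, so a
 * "const int" operand selects the "int" branch and yields plain int.
 * Types that are not listed fall through to "default: (x)", and the type
 * of that lvalue still carries its qualifiers.
 */
#define sketch_unqual_scalar_typeof(x) __typeof__(          \
        _Generic((x),                                       \
                 char: (char)0,                             \
                 signed char: (signed char)0,               \
                 unsigned char: (unsigned char)0,           \
                 short: (short)0,                           \
                 unsigned short: (unsigned short)0,         \
                 int: 0,                                    \
                 unsigned int: 0u,                          \
                 long: 0L,                                  \
                 unsigned long: 0UL,                        \
                 default: (x)))

/* Illustrative type only; not the kernel's struct fown_struct. */
struct fown_sketch { int pid; };

/* Returns 1 if 'const' was stripped from a const int. */
static int scalar_const_is_stripped(void)
{
        const int c = 42;
        sketch_unqual_scalar_typeof(c) copy = c;

        return _Generic(&copy, int *: 1, const int *: 0);
}

/* Returns 1 if 'const' was stripped from a const pointer. */
static int pointer_const_is_stripped(void)
{
        static struct fown_sketch owner;
        struct fown_sketch *const p = &owner;
        sketch_unqual_scalar_typeof(p) copy = p;

        return _Generic(&copy,
                        struct fown_sketch **: 1,
                        struct fown_sketch *const *: 0);
}

/* Usage: the scalar case is handled, the pointer case is not. */
static void sketch_check(void)
{
        assert(scalar_const_is_stripped() == 1);
        assert(pointer_const_is_stripped() == 0);
}
```

Calling sketch_check() from a test program should pass on a C11-capable GCC or Clang. The pointer case returning 0 is the 'const' that survives on compilers without __typeof_unqual__().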
We could probably just ignore the performance issue (a 'volatile' reload
from the stack, rare enough given that volatile variables are not usually
allowed) for these older versions and just say "use the newer compiler
to get better perf", but the 'const' issue will break the build:
| --- a/arch/arm64/include/asm/rwonce.h
| +++ b/arch/arm64/include/asm/rwonce.h
| @@ -46,7 +46,7 @@
| #define __READ_ONCE(x) \
| ({ \
| auto __x = &(x); \
| - auto __ret = (__rwonce_typeof_unqual(*__x) *)__x; \
| + auto __ret = (__unqual_scalar_typeof(*__x) *)__x; \
| /* Hides alias reassignment from Clang's -Wthread-safety. */ \
| auto __retp = &__ret; \
| union { typeof(*__ret) __val; char __c[1]; } __u; \
Results in:
| In file included from arch/arm64/kernel/asm-offsets.c:11:
| In file included from ./include/linux/arm_sdei.h:8:
| In file included from ./include/acpi/ghes.h:5:
| In file included from ./include/acpi/apei.h:9:
| In file included from ./include/linux/acpi.h:15:
| In file included from ./include/linux/device.h:32:
| In file included from ./include/linux/device/driver.h:21:
| In file included from ./include/linux/module.h:20:
| In file included from ./include/linux/elf.h:6:
| In file included from ./arch/arm64/include/asm/elf.h:141:
| ./include/linux/fs.h:1344:9: error: cannot assign to non-static data member '__val' with const-qualified type 'typeof (*__ret)' (aka 'struct fown_struct *const')
| 1344 | return READ_ONCE(file->f_owner);
| | ^~~~~~~~~~~~~~~~~~~~~~~~
| ./include/asm-generic/rwonce.h:50:2: note: expanded from macro 'READ_ONCE'
| 50 | __READ_ONCE(x); \
| | ^~~~~~~~~~~~~~
| ./arch/arm64/include/asm/rwonce.h:76:13: note: expanded from macro '__READ_ONCE'
| 76 | __u.__val = *(volatile typeof(*__x) *)__x; \
| | ~~~~~~~~~ ^
| ./include/linux/fs.h:1344:9: note: non-static data member '__val' declared const here
| 1344 | return READ_ONCE(file->f_owner);
| | ^~~~~~~~~~~~~~~~~~~~~~~~
| ./include/asm-generic/rwonce.h:50:2: note: expanded from macro 'READ_ONCE'
| 50 | __READ_ONCE(x); \
| | ^~~~~~~~~~~~~~
| ./arch/arm64/include/asm/rwonce.h:52:25: note: expanded from macro '__READ_ONCE'
| 52 | union { typeof(*__ret) __val; char __c[1]; } __u; \
| | ~~~~~~~~~~~~~~~^~~~~
... and many many more such errors.
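The failure mode can be reproduced in isolation. The following is a rough sketch under stated assumptions: struct fown_sketch, struct file_sketch, lvalue_is_const() and read_owner_once() are hypothetical names, and the union member type is spelled out by hand where the real macro would derive it with __typeof_unqual__(). Accessing a member through a pointer to const makes the lvalue const-qualified, so a union member declared with plain typeof() of that lvalue cannot be assigned.

```c
#include <assert.h>

/* Illustrative types only; not the kernel's definitions. */
struct fown_sketch { int pid; };
struct file_sketch { struct fown_sketch *f_owner; };

/* Evaluates to 1 if the lvalue x is a const-qualified pointer. */
#define lvalue_is_const(x)                                  \
        _Generic(&(x),                                      \
                 struct fown_sketch *const *: 1,            \
                 struct fown_sketch **: 0)

static struct fown_sketch *read_owner_once(const struct file_sketch *file)
{
        /*
         * The member type is written without qualifiers by hand, standing
         * in for what __typeof_unqual__(file->f_owner) would produce.
         * Declaring it as "__typeof__(file->f_owner) val" would make the
         * member const, and the assignment below would be rejected.
         */
        union { struct fown_sketch *val; char c[1]; } u;

        /* Volatile access so the compiler performs exactly one load. */
        u.val = *(struct fown_sketch *const volatile *)&file->f_owner;
        return u.val;
}

/* Usage: the same member is const only when reached via a const pointer. */
static void sketch_check(void)
{
        struct fown_sketch owner = { .pid = 7 };
        struct file_sketch f = { .f_owner = &owner };
        const struct file_sketch *cf = &f;

        assert(lvalue_is_const(cf->f_owner) == 1);
        assert(lvalue_is_const(f.f_owner) == 0);
        assert(read_owner_once(cf) == &owner);
}
```

The sketch leaves out everything else the arm64 macro does; it only isolates why the union member's type has to be unqualified.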
It's an unfortunate mess today, but I hope that sooner rather than later
we bump the minimum compiler versions so that we can unconditionally use
__typeof_unqual__() and delete __unqual_scalar_typeof(), the
__rwonce_typeof_unqual() workaround, and all the other code that appears
to be conditional on USE_TYPEOF_UNQUAL:
% git grep USE_TYPEOF_UNQUAL
arch/x86/include/asm/percpu.h:#if defined(CONFIG_USE_X86_SEG_SUPPORT) && defined(USE_TYPEOF_UNQUAL)