[PATCH v3 3/3] arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y

Linus Torvalds torvalds at linux-foundation.org
Tue Feb 17 08:23:34 PST 2026


On Mon, 16 Feb 2026 at 09:43, David Laight <david.laight.linux at gmail.com> wrote:
>
> >
> > Try doing something as simple as a "var++" on a volatile, and cry.
>
> On x86 I just see a load, inc, store - not that surprising really.
> (clang did do 'inc memory'.)
>
> It's not as though 'inc memory' is atomic (without a lock prefix).

That's not my point. My point is that it makes for absolutely
disgusting - and pointless - code generation.

That load + inc + store is a sign of the compiler missing truly basic
optimizations because "volatile" is so badly designed.
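
A minimal sketch of the difference, using made-up names (nothing here is
kernel code, and the two macros are simplified int-only stand-ins for
READ_ONCE()/WRITE_ONCE()). The first function increments an object that is
itself declared volatile, which is the case reported above as load + inc +
store with gcc on x86. The second keeps the data plain and marks only the
individual accesses:

```c
/* Hypothetical counters, for illustration only. */
volatile int vcount;   /* "volatile" attached to the data itself */
int count;             /* plain data, accesses marked at the use site */

/* Simplified, int-only stand-ins for the kernel macros (assumption). */
#define READ_ONCE_INT(x)     (*(const volatile int *)&(x))
#define WRITE_ONCE_INT(x, v) (*(volatile int *)&(x) = (v))

int bump_volatile(void)
{
    vcount++;          /* every access to vcount is a forced memory access */
    return vcount;     /* including this second, separate load */
}

int bump_plain(void)
{
    int v = READ_ONCE_INT(count) + 1;  /* exactly one marked load */

    WRITE_ONCE_INT(count, v);          /* exactly one marked store */
    return v;                          /* v is an ordinary local, free to optimize */
}
```

Neither version is atomic. The point of the second form is only that the
programmer chooses which accesses are special, and everything else stays
ordinary code.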

The thing is, we typically even *want* a single load. We actually want
not only to have basic optimizations that don't even change the
semantics - we typically even want CSE.

So we want basic optimizations and combining loads. The main reason to
use READ_ONCE() is actually a worry about compilers doing even worse
things, namely rematerialization of memory accesses - something that
compilers don't even do, because it's a bad idea, but people still
worry because they are _allowed_ to do it and who knows when something
silly happens.
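
The usual illustration of that worry is a value that is checked and then
used. The sketch below uses hypothetical names and a simplified
unsigned-int-only macro, not the kernel's real READ_ONCE(). With a plain
read of shared_idx, a compiler would be allowed to load it once for the
bounds check and again for the array index; with the marked read, the
checked value and the used value are the same local:

```c
/* Simplified stand-in for READ_ONCE(), unsigned int only (assumption). */
#define READ_ONCE_UINT(x) (*(const volatile unsigned int *)&(x))

/* Hypothetical shared state that another thread might update. */
unsigned int shared_idx;
const int table[4] = { 10, 20, 30, 40 };

int lookup(void)
{
    /* One load into a local; the compiler may not re-read shared_idx. */
    unsigned int i = READ_ONCE_UINT(shared_idx);

    if (i >= 4)
        return -1;     /* the check ... */
    return table[i];   /* ... and the use see the same value of i */
}
```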

So I want that READ_ONCE(), not "volatile" on data structures, because
*some* day we can rely on more modern things and compilers will
actually get it right if we do it as

 #define READ_ONCE(ptr) __atomic_load_n(ptr, __ATOMIC_RELAXED)

or similar.
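
A small sketch of that form, assuming a compiler that provides the
GCC/Clang __atomic builtins. The macro names carry a _PTR suffix to make
clear they are illustrative and take a pointer, as in the one-liner above;
the store counterpart and the helper functions are assumptions added for
the example:

```c
/* Pointer-taking variants built on the compiler's atomic builtins. */
#define READ_ONCE_PTR(ptr)       __atomic_load_n((ptr), __ATOMIC_RELAXED)
#define WRITE_ONCE_PTR(ptr, val) __atomic_store_n((ptr), (val), __ATOMIC_RELAXED)

/* Hypothetical shared variable. */
int shared_word;

void publish(int *p, int v)
{
    WRITE_ONCE_PTR(p, v);      /* relaxed store, no ordering implied */
}

int sum_twice(int *p)
{
    int a = READ_ONCE_PTR(p);  /* relaxed load */
    int b = READ_ONCE_PTR(p);  /* a second relaxed load of the same location */

    return a + b;
}
```

Unlike a volatile access, two relaxed loads like the ones in sum_twice()
are something a compiler could in principle combine. Whether any given
compiler actually does so is not something this sketch demonstrates.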

But last time I looked at it - which was admittedly a few years ago -
the compilers we supported didn't actually do anything reasonable here
(ie the built-in atomics were fundamentally worse than the ones we do
by hand, and even basic things like __atomic_load_n() weren't actually
better than just using 'volatile').

Maybe that has changed. We've upgraded minimum compilers since.

             Linus


