[PATCH v3 3/3] arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y
David Laight
david.laight.linux at gmail.com
Mon Feb 16 03:09:15 PST 2026
On Sun, 15 Feb 2026 15:40:07 -0800
Linus Torvalds <torvalds at linux-foundation.org> wrote:
> On Sun, 15 Feb 2026 at 14:44, Marco Elver <elver at google.com> wrote:
> >
> > I found e.g. xen_get_runstate_snapshot_cpu_delta() uses the >8 byte
> > case via __READ_ONCE(). READ_ONCE() itself is already restricted to <=
> > 8 bytes (due to that static assert), but that itself uses the
> > __READ_ONCE() helper which these patches were touching.
>
> I think we could just make __READ_ONCE() be totally unchecked - just
> make it be "const volatile typeof()" and leave it at that.
>
> Anybody who uses __READ_ONCE() would then have to deal with it.
>
> There are very few users of that left, I think it's mainly
> arch_atomic_read(), which just doesn't want any of the checking that a
> regular "READ_ONCE()" does.
>
> In fact, there are *so* few users left that I think we could probably
> just remove __READ_ONCE() entirely, and make our "atomic_t" use
> "volatile" in the type itself.
>
> I generally absolutely hate volatile on data structures - it's a
> design mistake (an understandable one: it was at least partly designed
> for memory mapped IO accesses), and the volatile should be in the
> accesses, not the data, because very often the volatility of the data
> depends on context, not on the data.

IIRC the bots now bleat when there is a READ_ONCE() in one place but
a write to the same location without a matching WRITE_ONCE().
That effectively forces all the accesses to be volatile regardless of
context, which is pretty much equivalent to making the structure
members volatile.

So if all you care about is 'inverted CSE' (one source read turned
into several loads) and read/write tearing, then volatile structure
members DTRT.
What you probably don't want is 'volatile struct foo *'.
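
A rough userspace sketch of the three options, to make the distinction
concrete. This is not kernel code: READ_ONCE_SKETCH()/WRITE_ONCE_SKETCH()
are simplified stand-ins for the real macros (which do more checking),
the struct and function names are made up for illustration, and GNU C
__typeof__ is assumed.

```c
#include <assert.h>

/*
 * Simplified stand-ins for READ_ONCE()/WRITE_ONCE(): the volatile is
 * in the access, not in the data. (Assumption: GNU C __typeof__.)
 */
#define READ_ONCE_SKETCH(x)	(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Option 1: plain members, every shared access annotated by hand. */
struct foo_plain {
	int state;
	int scratch;
};

/* Option 2: volatile only on the member that is actually shared. */
struct foo_vmember {
	volatile int state;
	int scratch;		/* ordinary member, can still be optimised */
};

int poll_plain(struct foo_plain *f)
{
	int a = READ_ONCE_SKETCH(f->state);
	int b = READ_ONCE_SKETCH(f->state);	/* second load, no CSE */

	return a + b;
}

int poll_vmember(struct foo_vmember *f)
{
	int a = f->state;	/* volatile comes from the member type */
	int b = f->state;	/* so this is a second load as well */

	return a + b;
}

/* Option 3: the pointer qualifier makes *every* member access volatile. */
int poll_vstruct(volatile struct foo_plain *f)
{
	return f->state + f->scratch;
}
```

Once every access to 'state' has to be annotated, options 1 and 2
should generate much the same code for that member; option 3 also
pessimises 'scratch', which is the part you probably don't want.
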
Volatile structure members are almost free: you lose CSE, and some
versions of gcc 'forget' that a load already zero/sign extended a byte
and extend it again.
(I had to use barrier() rather than volatile in some code where I really
did care about single instructions.)
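
For reference, barrier() in the kernel is an empty asm statement with
a "memory" clobber. A hypothetical userspace sketch of using one in
place of volatile follows; barrier_sketch() and sum_two_reads() are
made-up names, GNU C inline asm is assumed, and this is not the code
referred to above.

```c
#include <assert.h>

/* Same shape as the kernel's barrier(): a compiler-only barrier. */
#define barrier_sketch()	__asm__ __volatile__("" ::: "memory")

unsigned int sum_two_reads(const unsigned char *p)
{
	unsigned int a, b;

	a = *p;			/* ordinary (non-volatile) byte load */
	barrier_sketch();	/* compiler must assume *p may have changed */
	b = *p;			/* so it is loaded again, not reused */

	return a + b;
}
```

The loads themselves stay ordinary, so the compiler is free to pick
whatever single instruction it likes for each one. The price is that
the clobber makes it discard everything it has cached from memory, not
just *p.
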
I've never seen gcc reload a local, but I have seen it do CSE then
spill the value to the stack.

David
>
> But our "atomic_t" is already properly wrapped, and nobody should be
> accessing it with anything but our helpers, so putting the volatile
> there looks ok.
>
> Linus
>