[PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y

Will Deacon will at kernel.org
Wed Jul 1 06:19:23 EDT 2020


On Tue, Jun 30, 2020 at 09:25:03PM +0200, Arnd Bergmann wrote:
> On Tue, Jun 30, 2020 at 7:39 PM Will Deacon <will at kernel.org> wrote:
> > +#define __READ_ONCE(x)                                                 \
> > +({                                                                     \
> > +       int atomic = 1;                                                 \
> > +       union { __unqual_scalar_typeof(x) __val; char __c[1]; } __u;    \
> > +       typeof(&(x)) __x = &(x);                                        \
> > +       switch (sizeof(x)) {                                            \
> ...
> > +       atomic ? (typeof(x))__u.__val : (*(volatile typeof(x) *)__x);   \
> > +})
> 
> This expands (x) nine times (five in __unqual_scalar_typeof()), which can
> lead to significant code bloat after preprocessing if something passes a
> compound expression into READ_ONCE().
> The compiler works it out eventually, but we've seen an actual slowdown
> in compile speed from this recently, especially on clang.
> 
> I think if you move the
> 
>         typeof(&(x)) __x = &(x);
> 
> line first, all other instances can use typeof(*__x) instead of typeof(x)
> and avoid this problem.

Cheers, I was only thinking about side-effects when I wrote this, but
bloating build time is very unpopular, so I'll go with your suggestion.

> Once we make gcc-4.9 the minimum version,
> this could be further improved to
> 
>        __auto_type __x = &(x);

Is anybody working on moving to 4.9? I've seen the mails from Linus
championing it, but I thought there was a RHEL in support that people
might care about?

Will