[PATCH 2/3] arm64: Optimize __READ_ONCE() with CONFIG_LTO=y

Marco Elver elver at google.com
Mon Jan 26 11:54:24 PST 2026


On Mon, Jan 26, 2026 at 08:56AM +0100, Arnd Bergmann wrote:
> On Mon, Jan 26, 2026, at 01:25, Marco Elver wrote:
> > diff --git a/arch/arm64/include/asm/rwonce.h b/arch/arm64/include/asm/rwonce.h
> > index fc0fb42b0b64..9963948f4b44 100644
> > --- a/arch/arm64/include/asm/rwonce.h
> > +++ b/arch/arm64/include/asm/rwonce.h
> > @@ -32,8 +32,7 @@
> >  #define __READ_ONCE(x)							\
> >  ({									\
> >  	typeof(&(x)) __x = &(x);					\
> > -	int atomic = 1;							\
> > -	union { __unqual_scalar_typeof(*__x) __val; char __c[1]; } __u;	\
> > +	union { TYPEOF_UNQUAL(*__x) __val; char __c[1]; } __u;		\
> >  	switch (sizeof(x)) {						\
> >  	case 1:								\
> >  		asm volatile(__LOAD_RCPC(b, %w0, %1)			\
> 
> How does this work with CC_HAS_TYPEOF_UNQUAL=false?
> 
> As far as I can tell, TYPEOF_UNQUAL() falls back to __typeof__
> on gcc-13, clang-18 and earlier, and does not strip out qualifiers.

I think we only need to worry about Clang for LTO builds. But yeah, our
minimum supported Clang is 15, so between 15-18 it'd be broken.

> With fd69b2f7d5f4 ("compiler: Use __typeof_unqual__() for
> __unqual_scalar_typeof()"), I would expect __unqual_scalar_typeof()
> to do the right thing already.

It'd still be broken for Clang 15-18, so it won't help much. We need
this to work for more than "scalar", so even though it'll work for Clang
19+ given the redefinition to __typeof_unqual__, we should deprecate the
_Generic-based __unqual_scalar_typeof() sooner rather than later.
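
To illustrate the "more than scalar" point, here is a minimal userspace
sketch of a _Generic-based helper in the style of __unqual_scalar_typeof().
The names demo_scalar_cases, demo_unqual_scalar_typeof, and struct pair are
hypothetical, and the case list is abbreviated; this is not the kernel
macro. It assumes a GNU C compiler (GCC or Clang). Scalars lose their
qualifiers, but anything that lands in the default branch keeps them:

```c
#include <assert.h>

/* Hypothetical, abbreviated imitation of a _Generic-based scalar helper. */
#define demo_scalar_cases(type)						\
	unsigned type: (unsigned type)0,				\
	signed type: (signed type)0

#define demo_unqual_scalar_typeof(x) __typeof__(			\
	_Generic((x),							\
		 char: (char)0,						\
		 demo_scalar_cases(char),				\
		 demo_scalar_cases(short),				\
		 demo_scalar_cases(int),				\
		 demo_scalar_cases(long),				\
		 demo_scalar_cases(long long),				\
		 default: (x)))

struct pair { int a; int b; };

/* Returns 1 if a const volatile int is inferred as plain int. */
int scalar_drops_qualifiers(void)
{
	const volatile int cv = 42;
	demo_unqual_scalar_typeof(cv) copy = cv;

	(void)copy;
	return _Generic(&copy, int *: 1, default: 0);
}

/* Returns 1 if a const struct falls through to default and stays const. */
int struct_keeps_const(void)
{
	const struct pair p = { 1, 2 };
	demo_unqual_scalar_typeof(p) copy = p;

	(void)copy;
	return _Generic(&copy, const struct pair *: 1, default: 0);
}
```

The default branch simply yields typeof((x)), so for a struct the result
is still const-qualified and cannot be used as a writable temporary.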

I was able to make this work for older compilers:

diff --git a/arch/arm64/include/asm/rwonce.h b/arch/arm64/include/asm/rwonce.h
index 85b1dd7b0274..d6c808cc01be 100644
--- a/arch/arm64/include/asm/rwonce.h
+++ b/arch/arm64/include/asm/rwonce.h
@@ -19,6 +19,18 @@
 		"ldapr"	#sfx "\t" #regs,				\
 	ARM64_HAS_LDAPR)
 
+#ifdef USE_TYPEOF_UNQUAL
+#define __read_once_typeof(x) TYPEOF_UNQUAL(x)
+#else
+/*
+ * Fallback for older compilers to infer an unqualified type, using the fact
+ * that __auto_type is supposed to drop qualifiers. Unlike typeof_unqual(), the
+ * type must be complete (defines an unevaluated local variable). This must
+ * already be guaranteed because sizeof(x) is used in the __READ_ONCE macro.
+ */
+#define __read_once_typeof(x) typeof(({ __auto_type ____t = (x); ____t; }))
+#endif
+
 /*
  * When building with LTO, there is an increased risk of the compiler
  * converting an address dependency headed by a READ_ONCE() invocation
@@ -32,8 +44,8 @@
 #define __READ_ONCE(x)							\
 ({									\
 	auto __x = &(x);						\
-	auto __ret = (TYPEOF_UNQUAL(*__x) *)__x, *__retp = &__ret;	\
-	union { TYPEOF_UNQUAL(*__x) __val; char __c[1]; } __u;		\
+	auto __ret = (__read_once_typeof(*__x) *)__x, *__retp = &__ret;	\
+	union { __read_once_typeof(*__x) __val; char __c[1]; } __u;	\
 	*__retp = &__u.__val;						\
 	switch (sizeof(x)) {						\
 	case 1:								\
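
As a standalone illustration of the fallback above, the following userspace
sketch mimics the __auto_type trick outside the kernel. The names
demo_unqual_typeof and struct pair are hypothetical. It assumes a GNU C
compiler on which __auto_type drops qualifiers from its initializer (the
property the comment in the diff relies on; older compilers may behave
differently):

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the fallback in the diff: the statement
 * expression sits inside typeof, so it is never evaluated.
 */
#define demo_unqual_typeof(x) \
	__typeof__(({ __auto_type ____t = (x); ____t; }))

struct pair { int a; int b; };

/* Returns 1 if a const volatile int is inferred as plain int. */
int scalar_is_unqualified(void)
{
	const volatile int cv = 42;
	demo_unqual_typeof(cv) copy = cv;

	(void)copy;
	return _Generic(&copy, int *: 1, default: 0);
}

/*
 * Writes to a copy whose type was inferred from a const struct.
 * The assignment only compiles if const was dropped.
 */
int struct_copy_is_writable(void)
{
	const struct pair p = { 1, 2 };
	demo_unqual_typeof(p) q = p;

	q.a = 5;
	return q.a + q.b;
}
```

Unlike a _Generic-based helper, this is not limited to scalar types, but
the operand must have a complete type because a local variable of that
type is declared inside the statement expression.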


Thoughts?


