[PATCH v5 2/5] arm64: barrier: Add smp_cond_load_relaxed_timeout()

Will Deacon <will@kernel.org>
Thu Sep 18 13:05:22 PDT 2025


On Wed, Sep 10, 2025 at 08:46:52PM -0700, Ankur Arora wrote:
> Add smp_cond_load_relaxed_timeout(), a timed variant of
> smp_cond_load_relaxed().
> 
> This uses __cmpwait_relaxed() to do the actual waiting, with the
> event-stream guaranteeing that we wake up from WFE periodically
> and not block forever in case there are no stores to the cacheline.
> 
> For cases when the event-stream is unavailable, fall back to
> spin-waiting.
> 
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Reviewed-by: Haris Okanovic <harisokn@amazon.com>
> Tested-by: Haris Okanovic <harisokn@amazon.com>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  arch/arm64/include/asm/barrier.h | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index f5801b0ba9e9..4f0d9ed7a072 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -219,6 +219,29 @@ do {									\
>  	(typeof(*ptr))VAL;						\
>  })
>  
> +/* Re-declared here to avoid include dependency. */
> +extern bool arch_timer_evtstrm_available(void);
> +
> +#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
> +({									\
> +	typeof(ptr) __PTR = (ptr);					\
> +	__unqual_scalar_typeof(*ptr) VAL;				\
> +	bool __wfe = arch_timer_evtstrm_available();			\
> +									\
> +	for (;;) {							\
> +		VAL = READ_ONCE(*__PTR);				\
> +		if (cond_expr)						\
> +			break;						\
> +		if (time_check_expr)					\
> +			break;						\
> +		if (likely(__wfe))					\
> +			__cmpwait_relaxed(__PTR, VAL);			\
> +		else							\
> +			cpu_relax();					\

It'd be an awful lot nicer if we could just use the generic code when
wfe isn't available. One option would be to make that available as
e.g. __smp_cond_load_relaxed_timeout_cpu_relax() and call it from the
arch code when !arch_timer_evtstrm_available(), but a potentially
cleaner version would be to introduce something like cpu_poll_relax()
and use that in the core code.

So arm64 would do:

#define SMP_TIMEOUT_SPIN_COUNT	1
#define cpu_poll_relax(ptr, val)	do {				\
	if (arch_timer_evtstrm_available())				\
		__cmpwait_relaxed(ptr, val);				\
	else								\
		cpu_relax();						\
} while (0)

and then the core code would have:

#ifndef cpu_poll_relax
#define cpu_poll_relax(p, v)	cpu_relax()
#endif

and could just use cpu_poll_relax() in the generic implementation of
smp_cond_load_relaxed_timeout().

Will
