[PATCH v9 01/12] asm-generic: barrier: Add smp_cond_load_relaxed_timeout()

Ankur Arora ankur.a.arora at oracle.com
Fri Feb 13 20:58:08 PST 2026


David Laight <david.laight.linux at gmail.com> writes:

> On Sun,  8 Feb 2026 18:31:42 -0800
> Ankur Arora <ankur.a.arora at oracle.com> wrote:
>
>> Add smp_cond_load_relaxed_timeout(), which extends
>> smp_cond_load_relaxed() to allow waiting for a duration.
>>
>> We loop around waiting for the condition variable to change while
> >> periodically doing a time-check. The loop uses cpu_poll_relax() to slow
>> down the busy-waiting, which, unless overridden by the architecture
>> code, amounts to a cpu_relax().
>>
>> Note that there are two ways for the time-check to fail: the usual
>> timeout case or, @time_expr_ns returning an invalid value (negative
>> or zero). The second failure mode allows for clocks attached to the
>> clock-domain of @cond_expr, which might cease to operate meaningfully
>> once some state internal to @cond_expr has changed.
>>
>> Evaluation of @time_expr_ns: in the fastpath we want to keep the
>> performance close to smp_cond_load_relaxed(). To do that we defer
>> evaluation of the potentially costly @time_expr_ns to when we hit
>> the slowpath.
>>
>> This also means that there will always be some hardware dependent
>> duration that has passed in cpu_poll_relax() iterations at the time of
>> first evaluation. Additionally cpu_poll_relax() is not guaranteed to
>> return at timeout boundary. In sum, expect timeout overshoot when we
>> exit due to expiration of the timeout.
>>
>> The number of spin iterations before time-check, SMP_TIMEOUT_POLL_COUNT
>> is chosen to be 200 by default. With a cpu_poll_relax() iteration
>> taking ~20-30 cycles (measured on a variety of x86 platforms), we expect
>> a tim-check every ~4000-6000 cycles.
>     ^ time-check

Ugh. Thanks.

> Plus the cost of evaluating cond_expr 200 times.
> I guess that isn't expected to contain a PCIe read :-)

:). Good point. I'll see if I can add something like "when polling on
a memory address".
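
For a rough sense of the numbers, the interval between time-checks can
be estimated as below. This is only a back-of-the-envelope sketch:
cycles_between_checks() and its cond_cycles parameter are hypothetical
names used for illustration, not part of the patch, and the
per-iteration costs are assumed inputs rather than measurements.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): estimate the number of cycles
 * spent between two time-checks.
 *
 * poll_count:   spin iterations per time-check (SMP_TIMEOUT_POLL_COUNT)
 * relax_cycles: assumed cost of one cpu_poll_relax() iteration
 * cond_cycles:  assumed cost of one cond_expr evaluation
 */
static uint64_t cycles_between_checks(uint64_t poll_count,
				      uint64_t relax_cycles,
				      uint64_t cond_cycles)
{
	assert(poll_count > 0);

	/* Each iteration pays for both the relax and the condition. */
	return poll_count * (relax_cycles + cond_cycles);
}
```

With 200 iterations at 20-30 cycles each and a negligible cond_expr,
this gives the 4000-6000 cycle range from the commit message. Any
per-iteration cost in cond_expr adds 200 times that cost on top.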


Ankur


>>
>> The outer limit of the overshoot is double that when working with the
>> parameters above. This might be higher or lower depending on the
>> implementation of cpu_poll_relax() across architectures.
>>
>> Lastly, config option ARCH_HAS_CPU_RELAX indicates availability of a
>> cpu_poll_relax() that is cheaper than polling. This might be relevant
>> for cases with a prolonged timeout.
>>
>> Cc: Arnd Bergmann <arnd at arndb.de>
>> Cc: Will Deacon <will at kernel.org>
>> Cc: Catalin Marinas <catalin.marinas at arm.com>
>> Cc: Peter Zijlstra <peterz at infradead.org>
>> Cc: linux-arch at vger.kernel.org
>> Signed-off-by: Ankur Arora <ankur.a.arora at oracle.com>
>> ---
>> Notes:
>>   - Defer evaluation of @time_expr_ns to when we hit the slowpath.
>>   - This also helps get rid of the labelled gotos which were used to
>>     handle the early failure case (since now there's no early init
>>     to be concerned with.)
>>   - Add a comment mentioning that the cpu_poll_relax() implementation
>>     is better than polling if ARCH_HAS_CPU_RELAX.
>>
>>  include/asm-generic/barrier.h | 72 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 72 insertions(+)
>>
>> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
>> index d4f581c1e21d..2738fe35c1df 100644
>> --- a/include/asm-generic/barrier.h
>> +++ b/include/asm-generic/barrier.h
>> @@ -273,6 +273,68 @@ do {									\
>>  })
>>  #endif
>>
>> +/*
>> + * Number of times we iterate in the loop before doing the time check.
>> + */
>> +#ifndef SMP_TIMEOUT_POLL_COUNT
>> +#define SMP_TIMEOUT_POLL_COUNT		200
>> +#endif
>> +
>> +/*
>> + * Platforms with ARCH_HAS_CPU_RELAX have a cpu_poll_relax() implementation
>> + * that is expected to be cheaper (lower power) than pure polling.
>> + */
>> +#ifndef cpu_poll_relax
>> +#define cpu_poll_relax(ptr, val, timeout_ns)	cpu_relax()
>> +#endif
>> +
>> +/**
>> + * smp_cond_load_relaxed_timeout() - (Spin) wait for cond with no ordering
>> + * guarantees until a timeout expires.
>> + * @ptr: pointer to the variable to wait on.
>> + * @cond: boolean expression to wait for.
>> + * @time_expr_ns: expression that evaluates to monotonic time (in ns) or,
>> + *  on failure, returns a negative value.
>> + * @timeout_ns: timeout value in ns
>> + * Both of the above are assumed to be compatible with s64; the signed
>> + * value is used to handle the failure case in @time_expr_ns.
>> + *
>> + * Equivalent to using READ_ONCE() on the condition variable.
>> + *
>> + * Callers that expect to wait for prolonged durations might want to
>> + * take into account the availability of ARCH_HAS_CPU_RELAX.
>> + */
>> +#ifndef smp_cond_load_relaxed_timeout
>> +#define smp_cond_load_relaxed_timeout(ptr, cond_expr,			\
>> +				      time_expr_ns, timeout_ns)		\
>> +({									\
>> +	typeof(ptr) __PTR = (ptr);					\
>> +	__unqual_scalar_typeof(*ptr) VAL;				\
>> +	u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;			\
>> +	s64 __timeout = (s64)timeout_ns;				\
>> +	s64 __time_now, __time_end = 0;					\
>> +									\
>> +	for (;;) {							\
>> +		VAL = READ_ONCE(*__PTR);				\
>> +		if (cond_expr) 						\
>> +			break;						\
>> +		cpu_poll_relax(__PTR, VAL, (u64)__timeout);		\
>> +		if (++__n < __spin)					\
>> +			continue;					\
>> +		__time_now = (s64)(time_expr_ns);			\
>> +		if (unlikely(__time_end == 0))				\
>> +			__time_end = __time_now + __timeout;		\
>> +		__timeout = __time_end - __time_now;			\
>> +		if (__time_now <= 0 || __timeout <= 0) {		\
>> +			VAL = READ_ONCE(*__PTR);			\
>> +			break;						\
>> +		}							\
>> +		__n = 0;						\
>> +	}								\
>> +	(typeof(*ptr))VAL;						\
>> +})
>> +#endif
>> +
>>  /*
>>   * pmem_wmb() ensures that all stores for which the modification
>>   * are written to persistent storage by preceding instructions have
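
For readers who want to experiment with the loop structure outside the
kernel, here is a minimal userspace sketch of the same idea. It is an
approximation under stated assumptions, not the patch itself:
wait_equal_timeout(), now_ns() and POLL_COUNT are hypothetical names,
a relaxed C11 atomic load stands in for READ_ONCE(), the condition is
fixed to an equality test, and there is no cpu_poll_relax() equivalent,
so the spin is a plain busy loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Assumed stand-in for SMP_TIMEOUT_POLL_COUNT. */
#define POLL_COUNT 200

/* Monotonic time in ns, or a negative value on failure. */
static int64_t now_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return -1;
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Spin until *ptr == want or until roughly timeout_ns has elapsed.
 * Returns the last value read; the caller checks it to tell success
 * from timeout.
 */
static uint32_t wait_equal_timeout(_Atomic uint32_t *ptr, uint32_t want,
				   int64_t timeout_ns)
{
	uint32_t val, n = 0;
	int64_t time_now, time_end = 0;

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (val == want)
			break;
		/* Fast path: no clock read until POLL_COUNT spins pass. */
		if (++n < POLL_COUNT)
			continue;
		time_now = now_ns();
		/* The deadline is computed lazily, on the first slow path. */
		if (time_end == 0)
			time_end = time_now + timeout_ns;
		timeout_ns = time_end - time_now;
		if (time_now <= 0 || timeout_ns <= 0) {
			/* Timed out or clock failed: take one last look. */
			val = atomic_load_explicit(ptr, memory_order_relaxed);
			break;
		}
		n = 0;
	}
	return val;
}
```

Because the deadline is only established after the first POLL_COUNT
iterations, the sketch shows the same overshoot behaviour described in
the commit message: the time already spent spinning before the first
clock read is not counted against the timeout.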


--
ankur


