[PATCH v9 01/12] asm-generic: barrier: Add smp_cond_load_relaxed_timeout()
David Laight
david.laight.linux at gmail.com
Thu Feb 12 01:56:21 PST 2026
On Sun, 8 Feb 2026 18:31:42 -0800
Ankur Arora <ankur.a.arora at oracle.com> wrote:
> Add smp_cond_load_relaxed_timeout(), which extends
> smp_cond_load_relaxed() to allow waiting for a duration.
>
> We loop around waiting for the condition variable to change while
> periodically doing a time-check. The loop uses cpu_poll_relax() to slow
> down the busy-waiting, which, unless overridden by the architecture
> code, amounts to a cpu_relax().
>
> Note that there are two ways for the time-check to fail: the usual
> timeout case or, @time_expr_ns returning an invalid value (negative
> or zero). The second failure mode allows for clocks attached to the
> clock-domain of @cond_expr, which might cease to operate meaningfully
> once some state internal to @cond_expr has changed.
>
> Evaluation of @time_expr_ns: in the fastpath we want to keep the
> performance close to smp_cond_load_relaxed(). To do that we defer
> evaluation of the potentially costly @time_expr_ns to when we hit
> the slowpath.
>
> This also means that there will always be some hardware dependent
> duration that has passed in cpu_poll_relax() iterations at the time of
> first evaluation. Additionally cpu_poll_relax() is not guaranteed to
> return at timeout boundary. In sum, expect timeout overshoot when we
> exit due to expiration of the timeout.
>
> The number of spin iterations before time-check, SMP_TIMEOUT_POLL_COUNT
> is chosen to be 200 by default. With a cpu_poll_relax() iteration
> taking ~20-30 cycles (measured on a variety of x86 platforms), we expect
> a tim-check every ~4000-6000 cycles.
Typo: "tim-check" should be "time-check".

That estimate also leaves out the cost of evaluating cond_expr 200 times.
I guess that isn't expected to contain a PCIe read :-)

	David
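
To put rough numbers on that point, here is a minimal sketch of the interval
arithmetic. The only measured figure in the patch is ~20-30 cycles per
cpu_poll_relax() iteration; cond_cycles and the helper name
cycles_per_time_check() are hypothetical, introduced purely for illustration.

```c
#include <stdint.h>

/* Default from the patch; an architecture may override it. */
#define SMP_TIMEOUT_POLL_COUNT 200

/*
 * Cycles spent between two time-checks: every iteration pays for one
 * cpu_poll_relax() plus one evaluation of cond_expr.
 * cond_cycles is an assumed figure, not something the patch measures.
 */
static uint64_t cycles_per_time_check(uint64_t relax_cycles,
				      uint64_t cond_cycles)
{
	return (uint64_t)SMP_TIMEOUT_POLL_COUNT * (relax_cycles + cond_cycles);
}
```

With cond_cycles = 0 this reproduces the ~4000-6000 cycle figure from the
commit message. If cond_expr cost something like 1000 cycles per evaluation
(an assumed number for a slow uncached read), the interval between
time-checks would grow to roughly 200000 cycles.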
>
> The outer limit of the overshoot is double that when working with the
> parameters above. This might be higher or lower depending on the
> implementation of cpu_poll_relax() across architectures.
>
> Lastly, config option ARCH_HAS_CPU_RELAX indicates availability of a
> cpu_poll_relax() that is cheaper than polling. This might be relevant
> for cases with a prolonged timeout.
>
> Cc: Arnd Bergmann <arnd at arndb.de>
> Cc: Will Deacon <will at kernel.org>
> Cc: Catalin Marinas <catalin.marinas at arm.com>
> Cc: Peter Zijlstra <peterz at infradead.org>
> Cc: linux-arch at vger.kernel.org
> Signed-off-by: Ankur Arora <ankur.a.arora at oracle.com>
> ---
> Notes:
> - Defer evaluation of @time_expr_ns to when we hit the slowpath.
> - This also helps get rid of the labelled gotos which were used to
> handle the early failure case (since now there's no early init
> to be concerned with.)
> - Add a comment mentioning that the cpu_poll_relax() implementation
> is better than polling if ARCH_HAS_CPU_RELAX.
>
> include/asm-generic/barrier.h | 72 +++++++++++++++++++++++++++++++++++
> 1 file changed, 72 insertions(+)
>
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index d4f581c1e21d..2738fe35c1df 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -273,6 +273,68 @@ do { \
> })
> #endif
>
> +/*
> + * Number of times we iterate in the loop before doing the time check.
> + */
> +#ifndef SMP_TIMEOUT_POLL_COUNT
> +#define SMP_TIMEOUT_POLL_COUNT 200
> +#endif
> +
> +/*
> + * Platforms with ARCH_HAS_CPU_RELAX have a cpu_poll_relax() implementation
> + * that is expected to be cheaper (lower power) than pure polling.
> + */
> +#ifndef cpu_poll_relax
> +#define cpu_poll_relax(ptr, val, timeout_ns) cpu_relax()
> +#endif
> +
> +/**
> + * smp_cond_load_relaxed_timeout() - (Spin) wait for cond with no ordering
> + * guarantees until a timeout expires.
> + * @ptr: pointer to the variable to wait on.
> + * @cond_expr: boolean expression to wait for.
> + * @time_expr_ns: expression that evaluates to monotonic time (in ns) or,
> + * on failure, returns a negative value.
> + * @timeout_ns: timeout value in ns
> + * Both of the above are assumed to be compatible with s64; the signed
> + * value is used to handle the failure case in @time_expr_ns.
> + *
> + * Equivalent to using READ_ONCE() on the condition variable.
> + *
> + * Callers that expect to wait for prolonged durations might want to
> + * take into account the availability of ARCH_HAS_CPU_RELAX.
> + */
> +#ifndef smp_cond_load_relaxed_timeout
> +#define smp_cond_load_relaxed_timeout(ptr, cond_expr, \
> + time_expr_ns, timeout_ns) \
> +({ \
> + typeof(ptr) __PTR = (ptr); \
> + __unqual_scalar_typeof(*ptr) VAL; \
> + u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT; \
> + s64 __timeout = (s64)timeout_ns; \
> + s64 __time_now, __time_end = 0; \
> + \
> + for (;;) { \
> + VAL = READ_ONCE(*__PTR); \
> + if (cond_expr) \
> + break; \
> + cpu_poll_relax(__PTR, VAL, (u64)__timeout); \
> + if (++__n < __spin) \
> + continue; \
> + __time_now = (s64)(time_expr_ns); \
> + if (unlikely(__time_end == 0)) \
> + __time_end = __time_now + __timeout; \
> + __timeout = __time_end - __time_now; \
> + if (__time_now <= 0 || __timeout <= 0) { \
> + VAL = READ_ONCE(*__PTR); \
> + break; \
> + } \
> + __n = 0; \
> + } \
> + (typeof(*ptr))VAL; \
> +})
> +#endif
> +
> /*
> * pmem_wmb() ensures that all stores for which the modification
> * are written to persistent storage by preceding instructions have
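
For anyone who wants to poke at the control flow outside the kernel, below is
a rough userspace sketch of the loop in the patch, not the kernel macro
itself. It assumes C11 relaxed atomic loads in place of READ_ONCE(), a
function pointer in place of @time_expr_ns, a fixed "value is non-zero"
test in place of @cond_expr, and nothing at all in place of cpu_poll_relax().
The names wait_nonzero_timeout(), fake_clock_ns() and POLL_COUNT are
hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

#define POLL_COUNT 200	/* mirrors SMP_TIMEOUT_POLL_COUNT */

/* Hypothetical test clock: starts at 1 ns, advances 500 ns per call. */
static int64_t fake_now = 1;
static unsigned int fake_clock_calls;

static int64_t fake_clock_ns(void)
{
	int64_t t = fake_now;

	fake_now += 500;
	fake_clock_calls++;
	return t;
}

/*
 * Spin until *ptr is non-zero or timeout_ns has elapsed according to
 * time_ns(). Returns the last value read. The clock is consulted only
 * after POLL_COUNT failed polls, so the fast path never pays for it.
 */
static uint32_t wait_nonzero_timeout(_Atomic uint32_t *ptr,
				     int64_t (*time_ns)(void),
				     int64_t timeout_ns)
{
	uint32_t val, n = 0;
	int64_t now, end = 0;

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (val != 0)
			break;
		/* The kernel loop calls cpu_poll_relax() here. */
		if (++n < POLL_COUNT)
			continue;
		now = time_ns();
		if (end == 0)		/* first slow-path visit: set deadline */
			end = now + timeout_ns;
		if (now <= 0 || end - now <= 0) {
			/* Clock failure or timeout: take one last look. */
			val = atomic_load_explicit(ptr, memory_order_relaxed);
			break;
		}
		n = 0;
	}
	return val;
}
```

Because the deadline is only computed on the first slow-path visit, whatever
time passed during the first POLL_COUNT polls is not charged against the
timeout, which is one source of the overshoot described in the commit
message.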