[PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
Ankur Arora
ankur.a.arora at oracle.com
Fri Aug 29 15:38:35 PDT 2025
Okanovic, Haris <harisokn at amazon.com> writes:
> On Fri, 2025-08-29 at 01:07 -0700, Ankur Arora wrote:
>>
>> Hi,
>>
>> This series adds waited variants of the smp_cond_load() primitives:
>> smp_cond_load_relaxed_timewait(), and smp_cond_load_acquire_timewait().
>>
>> Why? As the name suggests, the new interfaces are meant for contexts
>> where you want to wait on a condition for a finite duration. This is
>> easy enough to do with a loop around cpu_relax(). However, some
>> architectures (ex. arm64) also allow waiting on a cacheline. So, these
>> interfaces handle a mixture of spin/wait with an smp_cond_load()
>> thrown in.
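>>
>> For reference, the open-coded spin would look roughly like the sketch
>> below (ptr, VAL_READY, and deadline are only illustrative, not names
>> from this series):
>>
>>     u64 val;
>>
>>     /* Poll with cpu_relax() until the value is ready or time runs out. */
>>     while (!((val = READ_ONCE(*ptr)) & VAL_READY)) {
>>             if (local_clock() >= deadline)
>>                     break;
>>             cpu_relax();
>>     }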
>>
>> There are two known users for these interfaces:
>>
>> - poll_idle() [1]
>> - resilient queued spinlocks [2]
>>
>> The interfaces are:
>> smp_cond_load_relaxed_timewait(ptr, cond_expr, time_check_expr)
>> smp_cond_load_acquire_timewait(ptr, cond_expr, time_check_expr)
>>
>> The added parameter, time_check_expr, determines the bail out condition.
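>>
>> As a usage sketch (dev->poll_flags, POLL_WAKE, and timeout_ns are
>> illustrative names; and, like smp_cond_load_relaxed(), the macro is
>> assumed to return the last value loaded):
>>
>>     u64 deadline = local_clock() + timeout_ns;
>>     unsigned long flags;
>>
>>     /* Wait until the flag is set or the deadline is reached. */
>>     flags = smp_cond_load_relaxed_timewait(&dev->poll_flags,
>>                                            VAL & POLL_WAKE,
>>                                            local_clock() >= deadline);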
>>
>> Changelog:
>> v3 [3]:
>> - further interface simplifications (suggested by Catalin Marinas)
>>
>> v2 [4]:
>> - simplified the interface (suggested by Catalin Marinas)
>> - get rid of wait_policy, and a multitude of constants
>> - adds a slack parameter
>> This helped remove a fair amount of code duplication and, in
>> hindsight, unnecessary constants.
>>
>> v1 [5]:
>> - add wait_policy (coarse and fine)
>> - derive spin-count etc at runtime instead of using arbitrary
>> constants.
>>
>> Haris Okanovic tested an earlier version of this series with the
>> poll_idle()/haltpoll patches [6].
>>
>> Any comments appreciated!
>>
>> Thanks!
>> Ankur
>>
>> [1] https://lore.kernel.org/lkml/20241107190818.522639-3-ankur.a.arora@oracle.com/
>> [2] Uses the smp_cond_load_acquire_timewait() from v1
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/rqspinlock.h
>> [3] https://lore.kernel.org/lkml/20250627044805.945491-1-ankur.a.arora@oracle.com/
>> [4] https://lore.kernel.org/lkml/20250502085223.1316925-1-ankur.a.arora@oracle.com/
>> [5] https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com/
>> [6] https://lore.kernel.org/lkml/f2f5d09e79539754ced085ed89865787fa668695.camel@amazon.com
>>
>> Cc: Arnd Bergmann <arnd at arndb.de>
>> Cc: Will Deacon <will at kernel.org>
>> Cc: Catalin Marinas <catalin.marinas at arm.com>
>> Cc: Peter Zijlstra <peterz at infradead.org>
>> Cc: Kumar Kartikeya Dwivedi <memxor at gmail.com>
>> Cc: Alexei Starovoitov <ast at kernel.org>
>> Cc: linux-arch at vger.kernel.org
>>
>> Ankur Arora (5):
>> asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
>> arm64: barrier: Add smp_cond_load_relaxed_timewait()
>> arm64: rqspinlock: Remove private copy of
>> smp_cond_load_acquire_timewait
>> asm-generic: barrier: Add smp_cond_load_acquire_timewait()
>> rqspinlock: use smp_cond_load_acquire_timewait()
>>
>> arch/arm64/include/asm/barrier.h | 22 ++++++++
>> arch/arm64/include/asm/rqspinlock.h | 84 +----------------------------
>> include/asm-generic/barrier.h | 57 ++++++++++++++++++++
>> include/asm-generic/rqspinlock.h | 4 ++
>> kernel/bpf/rqspinlock.c | 25 ++++-----
>> 5 files changed, 93 insertions(+), 99 deletions(-)
>>
>> --
>> 2.31.1
>>
>
> Tested on AWS Graviton 2, 3, and 4 (ARM64 Neoverse N1, V1, and V2) with
> your V10 haltpoll changes, atop 6.17.0-rc3 (commit 07d9df8008).
> Still seeing between 1.3x and 2.5x speedups in `perf bench sched pipe`
> and `seccomp-notify`; no change in `messaging`.
Great.
> Reviewed-by: Haris Okanovic <harisokn at amazon.com>
> Tested-by: Haris Okanovic <harisokn at amazon.com>
Thank you.
--
ankur