[PATCH RFC] Avoid memory barrier in read_seqcount() through load acquire
Waiman Long
longman at redhat.com
Tue Aug 13 12:01:36 PDT 2024
On 8/13/24 14:26, Christoph Lameter via B4 Relay wrote:
> From: "Christoph Lameter (Ampere)" <cl at gentwo.org>
>
> Some architectures support load acquire which can save us a memory
> barrier and save some cycles.
>
> A typical sequence
>
> do {
> seq = read_seqcount_begin(&s);
> <something>
> } while (read_seqcount_retry(&s, seq));
>
> requires 13 cycles on ARM64 for an empty loop. Two read memory barriers are
> needed. One for each of the seqcount_* functions.
>
> We can replace the first read barrier with a load acquire of
> the seqcount which saves us one barrier.
>
> On ARM64 doing so reduces the cycle count from 13 to 8.
>
> Signed-off-by: Christoph Lameter (Ampere) <cl at gentwo.org>
> ---
> arch/Kconfig | 5 +++++
> arch/arm64/Kconfig | 1 +
> include/linux/seqlock.h | 41 +++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 47 insertions(+)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 975dd22a2dbd..3f8867110a57 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1600,6 +1600,11 @@ config ARCH_HAS_KERNEL_FPU_SUPPORT
> Architectures that select this option can run floating-point code in
> the kernel, as described in Documentation/core-api/floating-point.rst.
>
> +config ARCH_HAS_ACQUIRE_RELEASE
> + bool
> + help
> + Architectures that support acquire / release can avoid memory fences
> +
> source "kernel/gcov/Kconfig"
>
> source "scripts/gcc-plugins/Kconfig"
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index a2f8ff354ca6..19e34fff145f 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -39,6 +39,7 @@ config ARM64
> select ARCH_HAS_PTE_DEVMAP
> select ARCH_HAS_PTE_SPECIAL
> select ARCH_HAS_HW_PTE_YOUNG
> + select ARCH_HAS_ACQUIRE_RELEASE
> select ARCH_HAS_SETUP_DMA_OPS
> select ARCH_HAS_SET_DIRECT_MAP
> select ARCH_HAS_SET_MEMORY
Do we need a new ARCH flag? I believe barrier APIs like
smp_load_acquire() already fall back to a full barrier on architectures
that don't define their own smp_load_acquire().
BTW, acquire/release can be considered memory barriers too. Maybe you
are talking about preferring acquire/release barriers over read/write
barriers. Right?
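To make the comparison concrete, here is a minimal user-space sketch in
C11 atomics. It is not the kernel implementation: every sketch_* name is
hypothetical, the acquire fence only approximates smp_rmb(), the acquire
load only approximates smp_load_acquire(), and a single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the kernel's seqcount_t. */
struct sketch_seqcount {
	atomic_uint sequence;
};

/*
 * Barrier-style begin: relaxed load of the counter, then a separate
 * acquire fence (an approximation of READ_ONCE() + smp_rmb()).
 */
static inline unsigned sketch_read_begin_fence(struct sketch_seqcount *s)
{
	unsigned seq;

	/* An odd value means a writer is in progress; spin until even. */
	while ((seq = atomic_load_explicit(&s->sequence,
					   memory_order_relaxed)) & 1)
		;
	atomic_thread_fence(memory_order_acquire);
	return seq;
}

/*
 * Load-acquire begin: the ordering is attached to the load itself
 * (an approximation of smp_load_acquire()), so no separate fence.
 */
static inline unsigned sketch_read_begin_acquire(struct sketch_seqcount *s)
{
	unsigned seq;

	while ((seq = atomic_load_explicit(&s->sequence,
					   memory_order_acquire)) & 1)
		;
	return seq;
}

/*
 * Retry check: the fence orders the reader's data loads before the
 * counter is re-read. This second barrier remains in both variants.
 */
static inline int sketch_read_retry(struct sketch_seqcount *s, unsigned start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence,
				    memory_order_relaxed) != start;
}

/* Writer side (single writer assumed): make the counter odd. */
static inline void sketch_write_begin(struct sketch_seqcount *s)
{
	unsigned seq = atomic_load_explicit(&s->sequence,
					    memory_order_relaxed);

	atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
	/* Order the counter update before the data stores that follow. */
	atomic_thread_fence(memory_order_release);
}

/* Writer side: make the counter even again after the data stores. */
static inline void sketch_write_end(struct sketch_seqcount *s)
{
	unsigned seq = atomic_load_explicit(&s->sequence,
					    memory_order_relaxed);

	assert(seq & 1);	/* must pair with sketch_write_begin() */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_release);
}
```

A reader would call one of the two begin variants, copy the protected
data, and loop while sketch_read_retry() returns nonzero. Only the begin
side differs between the variants, which matches the patch description
of replacing the first of the two read barriers.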
Cheers,
Longman