[PATCH] arm64: Use load LSE atomics for the non-return per-CPU atomic operations

Will Deacon will at kernel.org
Fri Nov 7 07:53:43 PST 2025


On Thu, 06 Nov 2025 15:52:13 +0000, Catalin Marinas wrote:
> The non-return per-CPU this_cpu_*() atomic operations are implemented as
> STADD/STCLR/STSET when FEAT_LSE is available. On many microarchitecture
> implementations, these instructions tend to be executed "far" in the
> interconnect or memory subsystem (unless the data is already in the L1
> cache). This is in general more efficient when there is contention as it
> avoids bouncing cache lines between CPUs. The load atomics (e.g. LDADD
> without XZR as destination), OTOH, tend to be executed "near" with the
> data loaded into the L1 cache.
> 
> [...]
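
For illustration, the "far" vs "near" distinction described in the quoted
text can be sketched in C roughly as follows. This is not the code from the
patch: the helper names (counter_add_far, counter_add_near) are made up for
this sketch, and the inline assembly is only built when the compiler targets
AArch64 with LSE enabled (__ARM_FEATURE_ATOMICS). On any other target both
helpers fall back to a generic compiler builtin as a stand-in.

```c
#include <stdint.h>

/*
 * Hypothetical helper: non-return add using the store form (STADD).
 * Per the quoted text, many implementations tend to execute this "far",
 * in the interconnect or memory subsystem.
 */
static inline void counter_add_far(uint64_t *ptr, uint64_t val)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_ATOMICS)
	__asm__ volatile("stadd	%[val], %[mem]"
			 : [mem] "+Q" (*ptr)
			 : [val] "r" (val));
#else
	/* Fallback for non-LSE builds: plain relaxed atomic add. */
	(void)__atomic_fetch_add(ptr, val, __ATOMIC_RELAXED);
#endif
}

/*
 * Hypothetical helper: the same non-return add, but using the load form
 * (LDADD) with a real scratch destination register instead of XZR.
 * Per the quoted text, this tends to execute "near", with the data
 * loaded into the L1 cache. The old value is simply discarded.
 */
static inline void counter_add_near(uint64_t *ptr, uint64_t val)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_ATOMICS)
	uint64_t tmp;

	__asm__ volatile("ldadd	%[val], %[tmp], %[mem]"
			 : [tmp] "=&r" (tmp), [mem] "+Q" (*ptr)
			 : [val] "r" (val));
	(void)tmp;
#else
	/* Fallback for non-LSE builds: plain relaxed atomic add. */
	(void)__atomic_fetch_add(ptr, val, __ATOMIC_RELAXED);
#endif
}
```

Both helpers leave the same value in memory. The only intended difference
is which instruction form is issued, and therefore where the update is
likely to be performed on a given microarchitecture.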

Applied to arm64 (for-next/fixes), thanks!

[1/1] arm64: Use load LSE atomics for the non-return per-CPU atomic operations
      https://git.kernel.org/arm64/c/535fdfc5a228

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


