[PATCH] arm64: Use load LSE atomics for the non-return per-CPU atomic operations

Christoph Lameter (Ampere) cl at gentwo.org
Wed Nov 12 09:25:28 PST 2025


On Thu, 6 Nov 2025, Catalin Marinas wrote:

> STADD executed back to back as in srcu_read_{lock,unlock}*() incur an
> additional overhead due to the default posting behaviour on several CPU
> implementations. Since the per-CPU atomics are unlikely to be used
> concurrently on the same memory location, encourage the hardware to
> execute them "near" by issuing load atomics - LDADD/LDCLR/LDSET - with
> the destination register unused (but not XZR).

Far atomics would evict the data from the L1 of the current CPU, I think.
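
For illustration only, the approach described in the quoted text could be
sketched in user-space C roughly as below. This is not the kernel's per-CPU
macro code: the name counter_add_near() is made up for the example, and the
fallback branch is only there so the sketch builds on non-LSE targets.
STADD is an alias of LDADD with XZR as the destination, so naming a real
scratch register is what makes this the "load" form. Whether a given CPU
then executes it near is implementation dependent.

```c
#include <stdint.h>

/*
 * Hypothetical helper: atomically add val to *ptr and discard the old value.
 * On arm64 builds with LSE enabled it issues LDADD with a real (unused)
 * destination register instead of STADD, which would encode XZR there.
 */
static inline void counter_add_near(uint64_t *ptr, uint64_t val)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_ATOMICS)
	uint64_t tmp;	/* receives the old value; deliberately never read */

	__asm__ volatile("ldadd %[val], %[tmp], %[ptr]"
			 : [tmp] "=&r" (tmp), [ptr] "+Q" (*ptr)
			 : [val] "r" (val));
#else
	/* Assumption: portable fallback so the sketch compiles elsewhere. */
	(void)__atomic_fetch_add(ptr, val, __ATOMIC_RELAXED);
#endif
}
```

Calling counter_add_near(&counter, 1) twice in a row mirrors the
back-to-back pattern mentioned for srcu_read_{lock,unlock}*(), just without
any of the per-CPU addressing.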



