[PATCH RFC] Avoid memory barrier in read_seqcount() through load acquire

Linus Torvalds torvalds at linux-foundation.org
Mon Aug 19 09:25:45 PDT 2024


On Mon, 19 Aug 2024 at 01:46, Mark Rutland <mark.rutland at arm.com> wrote:
>
> If you cannot disclose that for some reason, just say "on my ARM64 test
> machine" or something like that, so that we're not implying that this is
> true for all ARM64 implementations.

It's the same machine I have - an Ampere Altra. It's a standard
Neoverse N1 core, afaik.

It might also be a good idea to just point to the ARM documentation,
although I don't know how stable those web addresses are:

   https://developer.arm.com/documentation/102336/0100/Load-Acquire-and-Store-Release-instructions

and quoting the relevant part on that page:

 "Weaker ordering requirements that are imposed by Load-Acquire and
  Store-Release instructions allow for micro-architectural
  optimizations, which could reduce some of the performance impacts that
  are otherwise imposed by an explicit memory barrier.

  If the ordering requirement is satisfied using either a Load-Acquire
  or Store-Release, then it would be preferable to use these
  instructions instead of a DMB"

where that last sentence is basically ARM saying that load-acquire is
better than load+DMB and should be preferred.
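
For anyone who wants to see the two forms side by side, here is a rough
userspace sketch in C11 atomics. It is not the kernel code: "struct
seq_sketch" and both helper names are made up for illustration, and the
C11 acquire fence is only an approximate stand-in for what the kernel
does with a plain load plus a read barrier. On arm64, compilers
typically turn the first helper into a load followed by a DMB and the
second into a single LDAR, but check the generated assembly on your own
toolchain before relying on that.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a sequence counter; NOT the kernel's seqcount_t. */
struct seq_sketch {
	atomic_uint sequence;
};

/*
 * Variant A: plain (relaxed) load followed by an explicit acquire fence.
 * Assumption: on arm64 this is usually emitted as a load plus a DMB.
 */
static unsigned int read_begin_fence(struct seq_sketch *s)
{
	unsigned int seq;

	seq = atomic_load_explicit(&s->sequence, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	return seq;
}

/*
 * Variant B: a single load-acquire. Later accesses in program order
 * cannot be reordered before this load.
 * Assumption: on arm64 this is usually emitted as one LDAR.
 */
static unsigned int read_begin_acquire(struct seq_sketch *s)
{
	return atomic_load_explicit(&s->sequence, memory_order_acquire);
}
```

Both helpers return the same counter value; the difference is only in
how the ordering against later loads is enforced, which is the part the
ARM text above is talking about.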

             Linus


