LSE atomic op ordering is weaker than intended?
Hector Martin
marcan at marcan.st
Wed Mar 3 13:05:19 GMT 2021
Hi Will and everyone else,
While yak shaving the AIC driver ordering minutiae, I came across this.
atomic_t.txt describes "fully ordered" atomic ops as follows:
> Fully ordered primitives are ordered against everything prior and
> everything subsequent. Therefore a fully ordered primitive is like
> having an smp_mb() before and an smp_mb() after the primitive.
And among those ops are the atomic_fetch_* ops. These are implemented as
e.g. LDSETAL, with acquire-release semantics.
However, the *AL LSE ops have acquire semantics *for the read* and
release semantics *for the write*. These are independent components of
the same atomic op, and I cannot find anything in the ARM ARM that would
imply ordering between the Load-Acquire and *prior* memory operations, or
between the Store-Release and *subsequent* memory operations.
So it would seem these ops are not in fact fully ordered, but rather
only order the read component against subsequent ops, and the write
component against prior ops.
Put another way: the current implementation means that unqualified ops
are equal to _acquire + _release semantics as they are described in
atomic_t.txt, but that is weaker than "fully ordered".
Throwing this litmus test at herd7 seems to confirm this theory:
AArch64 lse-atomic-al-ops-are-not-fully-ordered
""
{
0:X1=x; 0:X3=y;
1:X1=x; 1:X3=y;
}
 P0                   | P1                   ;
 MOV X0, #1           | MOV X0, #1           ;
 LDSETAL X0, X2, [X1] | LDSETAL X0, X2, [X3] ;
 LDR X4, [X3]         | LDR X4, [X1]         ;
exists (0:X4=0 /\ 1:X4=0)
The positive result goes away when a DMB ISH (i.e. smp_mb()) is added
after the atomic ops, which contradicts the atomic_t.txt claim that a
fully ordered primitive already behaves as if followed by an smp_mb().
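For comparison, here is a rough sketch of what the "fully ordered" reading
would allow for this test. It assumes that smp_mb() on both sides of each
op reduces the test to plain interleaving of the two threads in program
order. This is a simplification, not a model of the ARM memory model, and
all names in it (P0, P1, interleavings, run, sc_outcomes) are made up for
illustration. It enumerates every interleaving and collects the resulting
(0:X4, 1:X4) pairs:

```python
# Each thread: an atomic "set bit 0" on one location (standing in for
# LDSETAL), followed by a load of the other location into X4.
P0 = [("rmw", "x"), ("load", "y")]
P1 = [("rmw", "y"), ("load", "x")]


def interleavings(p0, p1):
    """Yield every merge of p0 and p1 that keeps each thread's program order."""
    if not p0 and not p1:
        yield []
        return
    if p0:
        for rest in interleavings(p0[1:], p1):
            yield [(0, p0[0])] + rest
    if p1:
        for rest in interleavings(p0, p1[1:]):
            yield [(1, p1[0])] + rest


def run(schedule):
    """Execute one interleaving against a single shared memory; return (0:X4, 1:X4)."""
    mem = {"x": 0, "y": 0}
    x4 = {}
    for tid, (kind, loc) in schedule:
        if kind == "rmw":
            mem[loc] |= 1          # atomic OR of #1 into the location
        else:
            x4[tid] = mem[loc]     # plain load into X4
    return (x4[0], x4[1])


def sc_outcomes():
    """Set of (0:X4, 1:X4) results reachable by pure interleaving."""
    return {run(s) for s in interleavings(P0, P1)}


print(sorted(sc_outcomes()))
```

Under that assumption the (0, 0) outcome from the exists clause never
appears, which is why herd7 reporting it as reachable looks inconsistent
with the documented semantics.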
Did I miss something, or is this in fact an issue?
(And while I'm talking to the right people: this issue aside, do atomic
ops on Normal memory create ordering with Device memory ops, or are
there no guarantees there due to the fact that Normal memory is mapped
inner-shareable and the ordering guarantees thus do not extend to
outer-shareable Device accesses? My current understanding is the
latter, but I find the ARM ARM wording hard to conclusively grok here.)
--
Hector Martin (marcan at marcan.st)
Public Key: https://mrcn.st/pub