[PATCH v2 3/3] arm64: spinlock: use lock->owner to optimise spin_unlock_wait

Will Deacon will.deacon at arm.com
Fri Jun 10 05:46:23 PDT 2016


On Fri, Jun 10, 2016 at 02:25:20PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 08, 2016 at 05:25:39PM +0100, Will Deacon wrote:
> > Rather than wait until we observe the lock being free, we can also
> > return from spin_unlock_wait if we observe that the lock is now held
> > by somebody else, which implies that it was unlocked but we just missed
> > seeing it in that state.
> > 
> > Furthermore, in such a scenario there is no longer a need to write back
> > the value that we loaded, since we know that there has been a lock
> > hand-off, which is sufficient to publish any stores prior to the
> > unlock_wait.
> 
> You might want a few words on _why_ here. It took me a little while to
> figure that out.

How about "... because the ARM architecture ensures that a Store-Release
is multi-copy-atomic when observed by a Load-Acquire instruction"?
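
To make the control flow concrete, here's a rough userspace C sketch of
the idea. Note this is *not* the patch (that's inline asm against the
arm64 ticket lock); the lock layout, names and ordering below are made
up purely for illustration:

#include <stdatomic.h>
#include <stdint.h>

/* Made-up ticket-lock layout, for illustration only. */
struct sketch_lock {
	_Atomic uint32_t val;	/* owner in low half, next in high half */
};

static inline uint16_t lock_owner(uint32_t v) { return (uint16_t)v; }
static inline uint16_t lock_next(uint32_t v)  { return (uint16_t)(v >> 16); }

static void sketch_spin_unlock_wait(struct sketch_lock *lock)
{
	uint32_t v = atomic_load_explicit(&lock->val, memory_order_acquire);
	uint16_t first_owner = lock_owner(v);

	/* Spin while the lock is held... */
	while (lock_owner(v) != lock_next(v)) {
		/*
		 * ...but if the owner has moved on from the value we
		 * first loaded, somebody else holds the lock now, so
		 * an unlock (i.e. a hand-off) must have happened in
		 * between and we can return without ever observing
		 * the lock free.
		 */
		if (lock_owner(v) != first_owner)
			return;

		v = atomic_load_explicit(&lock->val, memory_order_acquire);
	}
}

The interesting path is the early return: as above, the hand-off itself
is enough to publish stores prior to the unlock_wait to the new owner,
so that path needs no write-back of the lock word (the real ordering
story is subtler than the acquire loads used in this sketch).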

> Also; human readable arguments to support the thing below go a long way
> into validating the test is indeed correct. Because as you've shown,
> even the validators cannot be trusted ;-)

Well, I didn't actually provide the output of a model here. I'm just
capturing the rationale in an unambiguous form.

> > The litmus test is something like:
> > 
> > AArch64
> > {
> > 0:X1=x; 0:X3=y;
> > 1:X1=y;
> > 2:X1=y; 2:X3=x;
> > }
> >  P0          | P1           | P2           ;
> >  MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
> >  STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
> >  DMB SY      |              |              ;
> >  LDR W2,[X3] |              |              ;
> > exists
> > (0:X2=0 /\ 2:X0=1 /\ 2:X2=0)
> > 
> > where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
> > doing spin_lock.
> 
> I still have a hard time deciphering these things..

I'll nail you down at LPC and share the kool-aid :)
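
In the meantime, reading the exists clause may help: the forbidden
outcome is one where P0 misses the unlock (0:X2=0), yet P2 takes the
lock after the hand-off (2:X0=1) without seeing P0's earlier store
(2:X2=0). Below is a rough C11 rendering of the same shape -- purely
illustrative, since the fence mapping is approximate and a single run
only samples one execution; herd7 with the AArch64 model is still the
right way to actually check it:

#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, y;		/* both start at 0 */
static int p0_y, p2_y, p2_x;	/* values each thread observed */

static void *p0(void *unused)	/* spin_unlock_wait() caller */
{
	atomic_store_explicit(&x, 1, memory_order_relaxed); /* prior store */
	atomic_thread_fence(memory_order_seq_cst);	    /* loosely ~ DMB SY */
	p0_y = atomic_load_explicit(&y, memory_order_relaxed);
	return NULL;
}

static void *p1(void *unused)	/* spin_unlock() */
{
	atomic_store_explicit(&y, 1, memory_order_release); /* ~ STLR */
	return NULL;
}

static void *p2(void *unused)	/* spin_lock() */
{
	p2_y = atomic_load_explicit(&y, memory_order_acquire); /* ~ LDAR */
	p2_x = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

int main(void)
{
	pthread_t t0, t1, t2;

	pthread_create(&t0, NULL, p0, NULL);
	pthread_create(&t1, NULL, p1, NULL);
	pthread_create(&t2, NULL, p2, NULL);
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	/* The "exists" clause: this combination should never be seen. */
	assert(!(p0_y == 0 && p2_y == 1 && p2_x == 0));
	return 0;
}

(Build with -pthread and run in a loop if you want it to sample
anything interesting.)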

Will


