FW: Commit 81a43adae3b9 (locking/mutex: Use acquire/release semantics) causing failures on arm64 (ThunderX)

Will Deacon will.deacon at arm.com
Fri Dec 11 04:18:00 PST 2015


On Fri, Dec 11, 2015 at 01:13:19PM +0100, Peter Zijlstra wrote:
> On Fri, Dec 11, 2015 at 12:04:19PM +0000, Will Deacon wrote:
> > I think Andrew meant the atomic_xchg_acquire at the start of osq_lock,
> > as opposed to "compare and swap". In which case, it does look like
> > there's a bug here because there is nothing to order the initialisation
> > of the node fields with publishing of the node, whether that's
> > indirectly as a result of setting the tail to the current CPU or
> > directly as a result of the WRITE_ONCE.
> 
> Agreed, this does indeed look like a bug. If confirmed please write a
> shiny changelog and I'll queue asap.

Yup. I've failed to reproduce the issue locally, so we'll need to wait
for Andrew and/or David to get back to us first.

Will

> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index d092a0c9c2d4..05a37857ab55 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -93,10 +93,12 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >  	node->cpu = curr;
> >  
> >  	/*
> > -	 * ACQUIRE semantics, pairs with corresponding RELEASE
> > -	 * in unlock() uncontended, or fastpath.
> > +	 * We need both ACQUIRE (pairs with corresponding RELEASE in
> > +	 * unlock() uncontended, or fastpath) and RELEASE (to publish
> > +	 * the node fields we just initialised) semantics when updating
> > +	 * the lock tail.
> >  	 */
> > -	old = atomic_xchg_acquire(&lock->tail, curr);
> > +	old = atomic_xchg(&lock->tail, curr);
> >  	if (old == OSQ_UNLOCKED_VAL)
> >  		return true;
> >  
> 
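
For readers following along outside the kernel tree, the ordering problem discussed above can be sketched in portable C11. This is only an illustration under stated assumptions, not the kernel code: the sketch_* names and SKETCH_UNLOCKED are hypothetical stand-ins, and memory_order_acq_rel is used to show the minimum the new comment asks for (ACQUIRE plus RELEASE). The kernel's atomic_xchg() is fully ordered, which is stronger than this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for OSQ_UNLOCKED_VAL; not the kernel definition. */
#define SKETCH_UNLOCKED 0

/* Hypothetical per-CPU queue node, loosely modelled on the fields in the thread. */
struct sketch_node {
	struct sketch_node *next;
	struct sketch_node *prev;
	int locked;
	int cpu;
};

/* Hypothetical queue head: the tail holds the id of the last queued CPU. */
struct sketch_queue {
	atomic_int tail;
};

/*
 * Initialise the node with plain stores, then publish it by swapping
 * our CPU id into the tail. Returns the previous tail value.
 *
 * With acquire-only ordering on the exchange, nothing would stop the
 * plain stores below from becoming visible after the tail update, so
 * another CPU could find the node before it is initialised. The
 * release half of acq_rel orders those stores before the exchange;
 * the acquire half pairs with the release in the unlock path.
 */
static int sketch_publish(struct sketch_queue *q, struct sketch_node *node,
			  int curr)
{
	assert(curr != SKETCH_UNLOCKED);

	node->locked = 0;
	node->next = NULL;
	node->cpu = curr;

	return atomic_exchange_explicit(&q->tail, curr, memory_order_acq_rel);
}

/* True when the queue was empty, i.e. the caller took the lock uncontended. */
static bool sketch_uncontended(int old_tail)
{
	return old_tail == SKETCH_UNLOCKED;
}
```

A caller would pass the value returned by sketch_publish() to sketch_uncontended() to decide whether it can return immediately or must queue behind the previous tail.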