FW: Commit 81a43adae3b9 (locking/mutex: Use acquire/release semantics) causing failures on arm64 (ThunderX)

Will Deacon will.deacon at arm.com
Fri Dec 11 05:33:14 PST 2015


On Fri, Dec 11, 2015 at 01:26:47PM +0100, Peter Zijlstra wrote:
> On Fri, Dec 11, 2015 at 12:18:00PM +0000, Will Deacon wrote:
> > On Fri, Dec 11, 2015 at 01:13:19PM +0100, Peter Zijlstra wrote:
> > > On Fri, Dec 11, 2015 at 12:04:19PM +0000, Will Deacon wrote:
> > > > I think Andrew meant the atomic_xchg_acquire at the start of osq_lock,
> > > > as opposed to "compare and swap". In which case, it does look like
> > > > there's a bug here because there is nothing to order the initialisation
> > > > of the node fields with publishing of the node, whether that's
> > > > indirectly as a result of setting the tail to the current CPU or
> > > > directly as a result of the WRITE_ONCE.
> > > 
> > > Agreed, this does indeed look like a bug. If confirmed please write a
> > > shiny changelog and I'll queue asap.
> > 
> > Yup. I've failed to reproduce the issue locally, so we'll need to wait
> > for Andrew and/or David to get back to us first.
> 
> While we're there, the acquire in osq_wait_next() seems somewhat ill
> documented too.
> 
> I _think_ we need ACQUIRE semantics there because we want to strictly
> order the lock-unqueue A,B,C steps and we get that with:
> 
>  A: SC
>  B: ACQ
>  C: Relaxed
> 
> Similarly for unlock we want the WRITE_ONCE to happen after
> osq_wait_next, but in that case we can even rely on the control
> dependency there.

Even for the lock-unqueue case, isn't B->C ordered by a control dependency
because C consists only of stores?

Will
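
For readers following the thread, the publication-ordering concern raised in
the quoted text (node fields initialised, then the node made visible through
the tail exchange or the store to the predecessor's next pointer) can be
sketched in userspace C11. This is only an illustration under stated
assumptions, not the kernel code: struct sketch_node, sketch_tail and
sketch_publish are hypothetical names, the tail is a pointer rather than an
encoded CPU number, and C11 memory orders stand in for the kernel's atomic
primitives.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an optimistic spin queue node. */
struct sketch_node {
    struct sketch_node *_Atomic next;
    struct sketch_node *prev;
    int locked;
};

/* Hypothetical queue tail; NULL means the queue is empty. */
static struct sketch_node *_Atomic sketch_tail;

/*
 * Initialise a node and publish it as the new tail.
 * Returns the previous tail, or NULL if the queue was empty.
 */
static struct sketch_node *sketch_publish(struct sketch_node *node)
{
    /* Initialisation of the node fields. */
    node->locked = 0;
    node->prev = NULL;
    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);

    /*
     * An acquire-only exchange would not order the stores above before
     * the node becomes reachable through the tail. The release half of
     * acq_rel provides that ordering in this sketch.
     */
    struct sketch_node *old =
        atomic_exchange_explicit(&sketch_tail, node, memory_order_acq_rel);

    if (old) {
        node->prev = old;
        /*
         * Direct publication to the predecessor. A release store is a
         * choice made for this sketch so that the prev assignment is
         * also ordered before the node becomes reachable via next.
         */
        atomic_store_explicit(&old->next, node, memory_order_release);
    }
    return old;
}
```

A reader of sketch_tail or of a predecessor's next pointer would pair these
with an acquire load before dereferencing the node.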


