[PATCH] arm64: spinlock: serialise spin_unlock_wait against concurrent lockers
boqun.feng at gmail.com
Sun Dec 6 00:16:17 PST 2015
On Thu, Dec 03, 2015 at 09:22:07AM -0800, Paul E. McKenney wrote:
> On Thu, Dec 03, 2015 at 04:32:43PM +0000, Will Deacon wrote:
> > Hi Peter, Paul,
> > Firstly, thanks for writing that up. I agree that you have something
> > that can work in theory, but see below.
> > On Thu, Dec 03, 2015 at 02:28:39PM +0100, Peter Zijlstra wrote:
> > > On Wed, Dec 02, 2015 at 04:11:41PM -0800, Paul E. McKenney wrote:
> > > > This looks architecture-agnostic to me:
> > > >
> > > > a. TSO systems have smp_mb__after_unlock_lock() be a no-op, and
> > > > have a read-only implementation for spin_unlock_wait().
> > > >
> > > > b. Small-scale weakly ordered systems can also have
> > > > smp_mb__after_unlock_lock() be a no-op, but must instead
> > > > have spin_unlock_wait() acquire the lock and immediately
> > > > release it, or some optimized implementation of this.
> > > >
> > > > c. Large-scale weakly ordered systems are required to define
> > > > smp_mb__after_unlock_lock() as smp_mb(), but can have a
> > > > read-only implementation of spin_unlock_wait().
> > >
> > > This would still require all relevant spin_lock() sites to be annotated
> > > with smp_mb__after_unlock_lock(), which is going to be a painful (no
> > > warning when done wrong) exercise and expensive (added MBs all over the
> > > place).
> On the lack of warning, agreed, but please see below. On the added MBs,
> the only alternative I have been able to come up with has even more MBs,
> as in on every lock acquisition. If I am missing something, please do
> not keep it a secret!
Maybe we can treat this as a problem of data accesses rather than one
of locks?
Let's take the example of tsk->flags in do_exit() and tsk->pi_lock: we
don't need a full barrier on every acquisition of ->pi_lock, because
some ->pi_lock critical sections never access the PF_EXITING bit of
->flags at all. All we need is a full barrier before reading the
PF_EXITING bit inside a ->pi_lock critical section. To achieve this, we
could introduce a primitive like:
(on PPC and ARM64v8)

#define smp_load_in_lock(x, lock)	\
	({				\
		smp_mb();		\
		READ_ONCE(x);		\
	})

(on other archs)

#define smp_load_in_lock(x, lock)	READ_ONCE(x)
and call it every time we read data that is not protected by the
current lock critical section but whose updaters synchronize with that
critical section via spin_unlock_wait().
I admit the name may be bad, and the second parameter @lock is intended
for some way of diagnosing misuse, which I haven't come up with yet ;-)