[PATCH 2/9] locking/qrwlock: avoid redundant atomic_add_return on read_lock_slowpath
Peter Zijlstra
peterz at infradead.org
Tue Jul 7 14:30:01 PDT 2015
On Tue, Jul 07, 2015 at 01:51:54PM -0400, Waiman Long wrote:
> >- cnts = atomic_add_return(_QR_BIAS,&lock->cnts) - _QR_BIAS;
> >+ atomic_add(_QR_BIAS,&lock->cnts);
> >+ cnts = smp_load_acquire((u32 *)&lock->cnts);
> > rspin_until_writer_unlock(lock, cnts);
> >
> > /*
>
> Atomic add on x86 is actually a full barrier too. The performance difference
> between "lock add" and "lock xadd" should be minor. The additional load,
> however, could potentially cause an additional cacheline load on a contended
> lock. So do you see an actual performance benefit from this change on ARM?
Yes, atomic_add() does not imply (and does not have) any memory barriers
on ARM.
More information about the linux-arm-kernel mailing list