[PATCH 1/2] ARM: spinlock: use ticket algorithm for ARMv6+ locking implementation
Nicolas Pitre
nico at fluxnic.net
Fri Jun 22 16:08:47 EDT 2012
On Fri, 22 Jun 2012, Will Deacon wrote:
> Ticket spinlocks ensure locking fairness by reducing the thundering herd
> effect when acquiring a lock.
The conventional thundering herd effect is still there: all waiting
CPUs are still woken up from WFE by the SEV issued on spin_unlock. The
fact that the actual spinning no longer hammers the exclusive monitor
may well count as a reduction of the thundering herd effect at that
level though, and if that is what you meant then it would be worth
spelling it out here.
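To illustrate the point, here is a sketch of what I presume the unlock
path ends up looking like (it is not part of the quoted hunk), showing
why every waiter still takes the SEV wakeup:

	/*
	 * Presumed unlock path (not quoted above): bumping the owner
	 * field and issuing SEV wakes every CPU parked in WFE, even
	 * though only the one whose ticket matches the new owner value
	 * will actually proceed.
	 */
	static inline void arch_spin_unlock(arch_spinlock_t *lock)
	{
		smp_mb();
		lock->tickets.owner++;
		dsb_sev();
	}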
> This is especially important on systems
> where memory-access times are not necessarily uniform when accessing
> the lock structure (for example, on a multi-cluster platform where the
> lock is allocated into L1 when a CPU releases it).
In which case it is more about fairness than about the thundering
herd: this algorithm guarantees FIFO ordering of lock acquisition
despite possible differences in memory access speed, and that is even
more important than reducing the thundering herd effect. Solving both
at once is of course all good.
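For the record, the FIFO property boils down to the following purely
illustrative plain C model (GCC atomic builtins standing in for the
ldrex/strex sequences in the patch; the example_* names are mine):

	/*
	 * Illustrative ticket lock in plain C -- not the ARM
	 * implementation below.  Each locker takes the next ticket
	 * number and waits until the "now serving" counter reaches it,
	 * so CPUs get the lock strictly in arrival order, whatever
	 * their relative memory access latency is.
	 */
	struct example_ticket_lock {
		unsigned short next;	/* next ticket to hand out */
		unsigned short owner;	/* ticket currently served */
	};

	static void example_ticket_lock_acquire(struct example_ticket_lock *l)
	{
		unsigned short me =
			__atomic_fetch_add(&l->next, 1, __ATOMIC_RELAXED);

		while (__atomic_load_n(&l->owner, __ATOMIC_ACQUIRE) != me)
			;	/* spin; the ARM code parks in WFE here */
	}

	static void example_ticket_lock_release(struct example_ticket_lock *l)
	{
		/* only the lock holder writes owner, so a plain increment
		 * published with release semantics is enough */
		__atomic_store_n(&l->owner, l->owner + 1, __ATOMIC_RELEASE);
	}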
> This patch implements the ticket spinlock algorithm for ARM, replacing
> the simpler implementation for ARMv6+ processors.
>
> Cc: Nicolas Pitre <nico at fluxnic.net>
> Signed-off-by: Will Deacon <will.deacon at arm.com>
One minor remark below. Otherwise...
Reviewed-by: Nicolas Pitre <nico at linaro.org>
> ---
> arch/arm/include/asm/spinlock.h | 73 ++++++++++++++++++++++-----------
> arch/arm/include/asm/spinlock_types.h | 17 +++++++-
> 2 files changed, 64 insertions(+), 26 deletions(-)
>
> diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h
> index 65fa3c8..dcca638 100644
> --- a/arch/arm/include/asm/spinlock.h
> +++ b/arch/arm/include/asm/spinlock.h
> @@ -79,31 +74,40 @@ static inline void dsb_sev(void)
> static inline void arch_spin_lock(arch_spinlock_t *lock)
> {
> unsigned long tmp;
> + u32 newval;
> + arch_spinlock_t lockval;
>
> __asm__ __volatile__(
> -"1: ldrex %0, [%1]\n"
> -" teq %0, #0\n"
> - WFE("ne")
> -" strexeq %0, %2, [%1]\n"
> -" teqeq %0, #0\n"
> +"1: ldrex %0, [%3]\n"
> +" add %1, %0, %4\n"
> +" strex %2, %1, [%3]\n"
> +" teq %2, #0\n"
> " bne 1b"
> - : "=&r" (tmp)
> - : "r" (&lock->lock), "r" (1)
> + : "=&r" (lockval), "=&r" (newval), "=&r" (tmp)
> + : "r" (&lock->slock), "I" (1 << TICKET_SHIFT)
> : "cc");
>
> + while (lockval.tickets.next != lockval.tickets.owner) {
> + wfe();
> + lockval.tickets.owner = ACCESS_ONCE(lock->tickets.owner);
> + }
> +
> smp_mb();
> }
>
> static inline int arch_spin_trylock(arch_spinlock_t *lock)
> {
> unsigned long tmp;
> + u32 slock;
>
> __asm__ __volatile__(
> -" ldrex %0, [%1]\n"
> -" teq %0, #0\n"
> -" strexeq %0, %2, [%1]"
> - : "=&r" (tmp)
> - : "r" (&lock->lock), "r" (1)
> +" ldrex %0, [%2]\n"
> +" cmp %0, %0, ror #16\n"
> +" movne %1, #1\n"
You could replace the above two instructions with a single one:
subs %1, %0, %0, ror #16
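That would make the whole thing look something like this (untested
sketch, operands unchanged; the function tail is my assumption since it
is not quoted in full):

	static inline int arch_spin_trylock(arch_spinlock_t *lock)
	{
		unsigned long tmp;
		u32 slock;

		__asm__ __volatile__(
	"	ldrex	%0, [%2]\n"
	/* compute next - owner and test for zero in one instruction */
	"	subs	%1, %0, %0, ror #16\n"
	"	addeq	%0, %0, %3\n"
	"	strexeq	%1, %0, [%2]"
		: "=&r" (slock), "=&r" (tmp)
		: "r" (&lock->slock), "I" (1 << TICKET_SHIFT)
		: "cc");

		if (tmp == 0) {
			smp_mb();
			return 1;
		}
		return 0;
	}

tmp ends up non-zero both when the lock is busy and when the strex
fails, so the "tmp == 0" success test still means the lock was taken
and the explicit movne becomes unnecessary.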
> +" addeq %0, %0, %3\n"
> +" strexeq %1, %0, [%2]"
> + : "=&r" (slock), "=&r" (tmp)
> + : "r" (&lock->slock), "I" (1 << TICKET_SHIFT)
> : "cc");
>
> if (tmp == 0) {
Nicolas