Runqueue spinlock recursion on arm64 v4.15

Mon Feb 5 05:36:00 PST 2018

On Fri, Feb 02, 2018 at 10:07:26PM +0000, Mark Rutland wrote:
> On Fri, Feb 02, 2018 at 08:55:06PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 02, 2018 at 07:27:04PM +0000, Mark Rutland wrote:
> > > ... in some cases, owner_cpu is -1, so I guess we're racing with an
> > > unlock. I only ever see this on the runqueue locks in wake up functions.
> > 
> > So runqueue locks are special in that the owner changes over a contex
> > switch, maybe something goes funny there?
> 
> Aha! I think that's it!
> 
> In finish_lock_switch() we do:
> 
> 	smp_store_release(&prev->on_cpu, 0);
> 	...
> 	rq->lock.owner = current;
> 
> As soon as we update prev->on_cpu, prev can be scheduled on another CPU, and
> can thus see a stale value for rq->lock.owner (e.g. if it tries to wake up
> another task on that rq).

I hacked in a forced vCPU preemption between the two using a sled of WFE
instructions, and now I can trigger the problem in seconds rather than
hours.

With the patch below applied, things seem to fine so far.

So I'm pretty sure this is it. I'll clean up the patch text and resend
that in a bit.

Thanks,
Mark.

> I guess the below (completely untested) would fix that. I'll try to give it a
> go next week.
> 
> Thanks,
> Mark.
> 
> ---->8----
> From 3ef7e7466d09d2270d2a353d032ff0f9bc0488a7 Mon Sep 17 00:00:00 2001
> From: Mark Rutland <mark.rutland at arm.com>
> Date: Fri, 2 Feb 2018 21:56:17 +0000
> Subject: [PATCH] sched/core: avoid spurious spinlock recursion splats
> 
> The runqueue locks are special in that the owner changes over a context switch.
> To ensure that this is account for in CONFIG_DEBUG_SPINLOCK builds,
> finish_lock_switch updates rq->lock.owner while the lock is held.
> 
> However, this happens *after* prev->on_cpu is cleared, which allows prev to be
> scheduled on another CPU. If prev then attempts to acquire the same rq lock, it
> will see itself as the owner.
> 
> This can be seen in virtual environments, where the vCPU can be preempted for
> an arbitrarily long period between updating prev->on_cpu and rq->lock.owner.
> 
> We can avoid this issue by updating rq->lock.owner first. The release of
> prev->on_cpu will ensure that the new owner is visible to prev if it is
> scheduled on another CPU.
> 
> Signed-off-by: Mark Rutland <mark.rutland at arm.com>
> Cc: Peter Zijlstra <peterz at infradead.org>
> Cc: Ingo Molnar <mingo at kernel.org>
> ---
>  kernel/sched/sched.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index b19552a212de..4f0d2e3701c3 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1342,6 +1342,10 @@ static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
>  
>  static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
>  {
> +#ifdef CONFIG_DEBUG_SPINLOCK
> +	/* this is a valid case when another task releases the spinlock */
> +	rq->lock.owner = current;
> +#endif
>  #ifdef CONFIG_SMP
>  	/*
>  	 * After ->on_cpu is cleared, the task can be moved to a different CPU.
> @@ -1355,10 +1359,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
>  	 */
>  	smp_store_release(&prev->on_cpu, 0);
>  #endif
> -#ifdef CONFIG_DEBUG_SPINLOCK
> -	/* this is a valid case when another task releases the spinlock */
> -	rq->lock.owner = current;
> -#endif
>  	/*
>  	 * If we are tracking spinlock dependencies then we have to
>  	 * fix up the runqueue lock - which gets 'carried over' from
> -- 
> 2.11.0
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel