RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

Paul E. McKenney paulmck at linux.vnet.ibm.com
Wed Aug 16 05:56:17 PDT 2017


On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney" <paulmck at linux.vnet.ibm.com> writes:
> ...
> >
> > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > Author: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
> > Date:   Mon Aug 14 08:54:39 2017 -0700
> >
> >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> >     
> >     Signed-off-by: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
> >
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index c7a899c5ce64..7358a5073dfb 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> >  	 * (not only the tick).
> >  	 */
> >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> >  	return tick;
> >  }
> 
> Should I be seeing negative values? A small sample:

Maybe due to hypervisor preemption delays, but I confess that I am
surprised to see them this large.  1,602,250,019 microseconds is something
like a half hour, which could result in stall warnings all by itself.

>           <idle>-0     [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
>           <idle>-0     [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>           <idle>-0     [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>           <idle>-0     [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
>           <idle>-0     [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
>           <idle>-0     [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
>           <idle>-0     [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
>           <idle>-0     [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>           <idle>-0     [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>           <idle>-0     [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> 
> 
> I have a full trace, I'll send it to you off-list.

I will take a look!

							Thanx, Paul




More information about the linux-arm-kernel mailing list