RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Wed Aug 16 05:56:17 PDT 2017
On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney" <paulmck at linux.vnet.ibm.com> writes:
> ...
> >
> > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > Author: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
> > Date: Mon Aug 14 08:54:39 2017 -0700
> >
> > EXP: Trace tick return from tick_nohz_stop_sched_tick
> >
> > Signed-off-by: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
> >
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index c7a899c5ce64..7358a5073dfb 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > * (not only the tick).
> > */
> > ts->sleep_length = ktime_sub(dev->next_event, now);
> > + trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > return tick;
> > }
>
> Should I be seeing negative values? A small sample:
Maybe due to hypervisor preemption delays, but I confess that I am
surprised to see them this large. 1,602,250,019 microseconds is about
1,602 seconds, or roughly 27 minutes, which could result in stall
warnings all by itself.
> <idle>-0 [015] d... 1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> <idle>-0 [009] d... 1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> <idle>-0 [007] d... 1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> <idle>-0 [048] d... 1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> <idle>-0 [006] d... 1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> <idle>-0 [001] d... 1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> <idle>-0 [008] d... 1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> <idle>-0 [006] d... 1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> <idle>-0 [009] d... 1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> <idle>-0 [001] d... 1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>
>
> I have a full trace, I'll send it to you off-list.
I will take a look!
Thanx, Paul