RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
Michael Ellerman
mpe at ellerman.id.au
Thu Aug 17 06:55:51 PDT 2017
"Paul E. McKenney" <paulmck at linux.vnet.ibm.com> writes:
> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
>> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
>> > "Paul E. McKenney" <paulmck at linux.vnet.ibm.com> writes:
>> > ...
>> > >
>> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
>> > > Author: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
>> > > Date: Mon Aug 14 08:54:39 2017 -0700
>> > >
>> > > EXP: Trace tick return from tick_nohz_stop_sched_tick
>> > >
>> > > Signed-off-by: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
>> > >
>> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> > > index c7a899c5ce64..7358a5073dfb 100644
>> > > --- a/kernel/time/tick-sched.c
>> > > +++ b/kernel/time/tick-sched.c
>> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>> > > * (not only the tick).
>> > > */
>> > > ts->sleep_length = ktime_sub(dev->next_event, now);
>> > > + trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>> > > return tick;
>> > > }
>> >
>> > Should I be seeing negative values? A small sample:
>>
>> Maybe due to hypervisor preemption delays, but I confess that I am
>> surprised to see them this large. 1,602,250,019 microseconds is something
>> like a half hour, which could result in stall warnings all by itself.
Hmm, this is a bare metal machine, so no hypervisor.
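While I'm here, a quick unit check on the numbers (my own scratch program,
not part of your patch): ktime_t deltas are nanoseconds, so the /1000 in the
trace_printk makes the printed values microseconds, and a negative value
would mean the computed tick is already in the past.

  /* Scratch check of the magnitude quoted above. */
  #include <stdio.h>

  int main(void)
  {
          long long delta_us = 1602250019LL;  /* value quoted above, in us */

          printf("%lld us = %.2f s = %.1f min\n",
                 delta_us, delta_us / 1e6, delta_us / 6e7);
          return 0;
  }

Which does indeed come out around 27 minutes.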
>> I will take a look!
>
> And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
> kthread that stalled. This kthread was starved, based on this from your
> dmesg:
>
> [ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
>
> The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
> idle-CPU and offline-CPU quiescent states, which means that its waits
> will be accompanied by short timeouts. The "starved for 2603 jiffies"
> says that it has not run for one good long time. The ->state is its
> task_struct ->state field.
>
> The immediately preceding dmesg line is as follows:
>
> [ 1602.063851] (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)
>
> In other words, the rcu_sched grace-period kthread has been starved
> for the entire duration of the current grace period, as shown by the
> t=2603.
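For my own notes while decoding that line: ->state=0x1 should be
TASK_INTERRUPTIBLE (going by include/linux/sched.h, so correct me if I'm
misreading), i.e. the kthread was asleep on its short FQS timeout rather
than runnable, and at HZ=100 the 2603 jiffies it was starved for is about
26 seconds:

  /* Rough decode of the stall message fields, assuming the task-state
   * values from include/linux/sched.h (TASK_INTERRUPTIBLE == 0x1). */
  #include <stdio.h>

  #define HZ 100                          /* my config */
  #define TASK_INTERRUPTIBLE 0x1

  int main(void)
  {
          unsigned long state = 0x1;      /* ->state=0x1 from the dmesg line */
          unsigned long starved = 2603;   /* "starved for 2603 jiffies" */

          printf("sleeping interruptibly: %s\n",
                 (state & TASK_INTERRUPTIBLE) ? "yes" : "no");
          printf("starved for %.2f s at HZ=%d\n", (double)starved / HZ, HZ);
          return 0;
  }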
>
> Let's turn now to the trace output, looking for the last bit of the
> rcu_sched task's activity:
>
> rcu_sched-9 [054] d... 1576.030096: timer_start: timer=c0000007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
> ksoftirqd/53-276 [053] ..s. 1576.030097: rcu_invoke_callback: rcu_sched rhp=c000000fcf8c4eb0 func=__d_free
> rcu_sched-9 [054] d... 1576.030097: rcu_utilization: Start context switch
> ksoftirqd/53-276 [053] ..s. 1576.030098: rcu_invoke_callback: rcu_sched rhp=c000000fcff74ee0 func=proc_i_callback
> rcu_sched-9 [054] d... 1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
> rcu_sched-9 [054] d... 1576.030099: rcu_utilization: End context switch
>
> So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
> but what is with "expires=4295094922"?)
That's a good one.
I have HZ=100, and therefore:
INITIAL_JIFFIES = (1 << 32) - (300 * 100) = 4294937296
So the expires value of 4295094922 is:
4295094922 - 4294937296 = 157626
Jiffies since boot.
Or 1,576,260,000,000 ns == 1576.26 s.
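A standalone check of that arithmetic, using INITIAL_JIFFIES as I read it
from include/linux/jiffies.h (so treat the macro as my assumption):

  /* INITIAL_JIFFIES as defined in include/linux/jiffies.h (to the best of
   * my reading): jiffies starts 5 minutes before the 32-bit wrap so that
   * wrap bugs show up early. */
  #include <stdio.h>

  #define HZ 100
  #define INITIAL_JIFFIES ((unsigned long)(unsigned int)(-300 * HZ))

  int main(void)
  {
          unsigned long expires = 4295094922UL;   /* from the timer_start trace */
          unsigned long since_boot = expires - INITIAL_JIFFIES;

          printf("INITIAL_JIFFIES    = %lu\n", INITIAL_JIFFIES);   /* 4294937296 */
          printf("jiffies since boot = %lu (%.2f s at HZ=%d)\n",
                 since_boot, (double)since_boot / HZ, HZ);         /* 157626 -> 1576.26 s */
          return 0;
  }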
> Of course, the timer will have expired in the context of some other task,
> but a search for "c0000007fae1bc20" (see the "timer=" in the first trace
> line above) shows nothing (to be painfully accurate, the search wraps back
> to earlier uses of this timer by rcu_sched). So the timer never did fire.
Or it just wasn't in the trace?
I'll try and get it to trace a bit longer and see if that is helpful.
cheers