am335x: 5.18.x: system stalling
Yegor Yefremov
yegorslists at googlemail.com
Tue May 31 01:36:20 PDT 2022
On Mon, May 30, 2022 at 5:15 PM Ard Biesheuvel <ardb at kernel.org> wrote:
>
> On Mon, 30 May 2022 at 15:54, Arnd Bergmann <arnd at arndb.de> wrote:
> >
> > On Sat, May 28, 2022 at 9:28 PM Yegor Yefremov
> > <yegorslists at googlemail.com> wrote:
> > >
> > > On Sat, May 28, 2022 at 3:14 PM Arnd Bergmann <arnd at arndb.de> wrote:
> > > >
> > > > On Sat, May 28, 2022 at 3:01 PM Yegor Yefremov
> > > > <yegorslists at googlemail.com> wrote:
> > > > > On Sat, May 28, 2022 at 11:07 AM Ard Biesheuvel <ardb at kernel.org> wrote:
> > > > > In file included from ./include/linux/irqflags.h:17,
> > > > > from ./arch/arm/include/asm/bitops.h:28,
> > > > > from ./include/linux/bitops.h:33,
> > > > > from ./include/linux/log2.h:12,
> > > > > from kernel/bounds.c:13:
> > > > > ./arch/arm/include/asm/percpu.h: In function ‘__my_cpu_offset’:
> > > > > ./arch/arm/include/asm/percpu.h:32:9: error: ‘__per_cpu_offset’
> > > > > undeclared (first use in this function); did you mean
> > > > > ‘__my_cpu_offset’?
> > > > > 32 | return __per_cpu_offset[0];
> > > > > | ^~~~~~~~~~~~~~~~
> > > > > | __my_cpu_offset
> > > > > ./arch/arm/include/asm/percpu.h:32:9: note: each undeclared identifier
> > > > > is reported only once for each function it appears in
> > > >
> > > > I think you just missed the line in my patch that adds the
> > > > "extern unsigned long __per_cpu_offset[];" variable declaration.
> > >
> > > So, I tried both variants and both led to stalls.
> >
> > I'm running out of ideas here. Going back to the original bisection,
> > I rebased Ard's patches in a way that you should be able to build the
> > config for each patch, and I split up the "ARM: implement
> > THREAD_INFO_IN_TASK for uniprocessor systems" commit in yet
> > another way, hoping to get something left over that points to the
> > bug. Can you try bisecting through the top commits of
> >
> > https://kernel.org/pub/scm/linux/kernel/git/soc/soc.git am335x-stall-test
> >
> > starting maybe with "52d240871760 irqchip: nvic: Use
> > GENERIC_IRQ_MULTI_HANDLER" as the patch that is almost certainly
> > going to be ok?
> >
> > At some point I fear we may have to give up and just mark the v6+SMP
> > configuration as broken, which is something we have considered in the
> > past but ended up always keeping around for the purpose of testing
> > omap2plus_defconfig and imx_v6_v7_defconfig. Note that on production
> > systems you probably don't want to use that config anyway, and should
> > either stick to a uniprocessor build, or disable the ARMv6 support.
> >
>
> Yeah, I am also running out of ideas. One question, though: does the
> RCU detected stall always occur in the same place? I.e., how similar
> are the backtraces of the stalls between different occurrences?
> Perhaps we could narrow down where in the code we are stalling, and
> gain some more understanding of the root cause.
I have attached 4 crash logs and will start bisecting Arnd's branch.
Yegor
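
To compare the attached backtraces across occurrences, a small script can list where each "__irq_svc" frame returned to. This is only a sketch: the helper names (interrupted_sites, summarize) are made up for illustration, and it assumes the "[ timestamp] caller from callee+0xoff/0xsize" frame layout seen in the logs below.

```python
import re
from collections import Counter

# Matches unwinder lines such as:
#   [  220.128675] __irq_svc from _omap3_noncore_dpll_lock+0x14/0xc4
# Layout assumed from the attached logs, not from any kernel documentation.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\S+) from (\S+?)\+(0x[0-9a-f]+)/0x[0-9a-f]+\s*$"
)


def interrupted_sites(log_text):
    """Return (function, offset) for every '__irq_svc from ...' frame."""
    sites = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(1) == "__irq_svc":
            sites.append((match.group(2), match.group(3)))
    return sites


def summarize(logs):
    """Count how often each interrupted site appears across several logs."""
    counts = Counter()
    for text in logs:
        counts.update(interrupted_sites(text))
    return counts
```

Feeding the text of each attachment to summarize() gives a quick count of the interrupted sites. In the four logs below, the interrupted task context is always inside omap3_noncore_dpll_program or one of its callees (_omap3_noncore_dpll_lock, clk_memmap_readl), reached from dbs_work_handler via the cpufreq path.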
-------------- next part --------------
[ 219.721096] rcu: INFO: rcu_sched self-detected stall on CPU
[ 219.727845] rcu: 0-...!: (2600 ticks this GP) idle=e7d/1/0x40000004 softirq=3592/3592 fqs=0
[ 219.737376] (t=2600 jiffies g=5525 q=21)
[ 219.742051] rcu: rcu_sched kthread timer wakeup didn't happen for 2599 jiffies! g5525 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 219.753979] rcu: Possible timer handling issue on cpu=0 timer-softirq=2867
[ 219.761534] rcu: rcu_sched kthread starved for 2600 jiffies! g5525 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 219.772512] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 219.782043] rcu: RCU grace-period kthread stack dump:
[ 219.787605] task:rcu_sched state:I stack: 0 pid: 11 ppid: 2 flags:0x00000000
[ 219.797138] __schedule from schedule+0x58/0xcc
[ 219.802763] schedule from schedule_timeout+0x78/0xf8
[ 219.808847] schedule_timeout from rcu_gp_fqs_loop+0x108/0x3d0
[ 219.815741] rcu_gp_fqs_loop from rcu_gp_kthread+0xa8/0x134
[ 219.822273] rcu_gp_kthread from kthread+0xe4/0x104
[ 219.828121] kthread from ret_from_fork+0x14/0x28
[ 219.833664] Exception stack(0xd0041fb0 to 0xd0041ff8)
[ 219.839459] 1fa0: 00000000 00000000 00000000 00000000
[ 219.848426] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 219.857325] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 219.864572] rcu: Stack dump where RCU GP kthread last ran:
[ 219.870636] NMI backtrace for cpu 0
[ 219.874702] CPU: 0 PID: 58 Comm: kworker/0:8 Tainted: G W 5.18.0-rc7 #14
[ 219.883491] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 219.890214] Workqueue: events dbs_work_handler
[ 219.895659] unwind_backtrace from show_stack+0x10/0x14
[ 219.901897] show_stack from dump_stack_lvl+0x58/0x70
[ 219.908005] dump_stack_lvl from nmi_cpu_backtrace+0xe0/0x128
[ 219.914814] nmi_cpu_backtrace from nmi_trigger_cpumask_backtrace+0xec/0x184
[ 219.922875] nmi_trigger_cpumask_backtrace from trigger_single_cpu_backtrace+0x20/0x2c
[ 219.931828] trigger_single_cpu_backtrace from rcu_check_gp_kthread_starvation+0xf4/0x148
[ 219.941030] rcu_check_gp_kthread_starvation from rcu_sched_clock_irq+0xa98/0xf8c
[ 219.949573] rcu_sched_clock_irq from update_process_times+0x88/0xc0
[ 219.957001] update_process_times from tick_sched_handle+0x48/0x54
[ 219.964167] tick_sched_handle from tick_sched_timer+0x48/0xac
[ 219.970891] tick_sched_timer from __hrtimer_run_queues+0x250/0x4e4
[ 219.978144] __hrtimer_run_queues from hrtimer_interrupt+0x128/0x2c8
[ 219.985517] hrtimer_interrupt from dmtimer_clockevent_interrupt+0x24/0x2c
[ 219.993529] dmtimer_clockevent_interrupt from __handle_irq_event_percpu+0x98/0x334
[ 220.002289] __handle_irq_event_percpu from handle_irq_event+0x38/0xc0
[ 220.009770] handle_irq_event from handle_level_irq+0xb4/0x1a8
[ 220.016630] handle_level_irq from handle_irq_desc+0x1c/0x2c
[ 220.023279] handle_irq_desc from generic_handle_arch_irq+0x2c/0x64
[ 220.030501] generic_handle_arch_irq from __irq_svc+0x90/0xbc
[ 220.037105] Exception stack(0xd0001f58 to 0xd0001fa0)
[ 220.042841] 1f40: c01015c8 00000000
[ 220.051805] 1f60: 0eaec000 00000000 fffffe00 600f0013 ffffffff d0385d5c 00000000 c3744a80
[ 220.060765] 1f80: 00000200 c3744a80 c208dcd8 d0001fa8 c01015c8 c01015d0 600f0113 ffffffff
[ 220.069580] __irq_svc from __do_softirq+0xa0/0x5fc
[ 220.075370] __do_softirq from __irq_exit_rcu+0x138/0x178
[ 220.081788] __irq_exit_rcu from irq_exit+0x8/0x28
[ 220.087557] irq_exit from call_with_stack+0x18/0x20
[ 220.093503] call_with_stack from __irq_svc+0x9c/0xbc
[ 220.099402] Exception stack(0xd0385d28 to 0xd0385d70)
[ 220.105218] 5d20: c208dd04 f9e00488 c2006940 c191a2fc c208dcc0 c208a680
[ 220.114175] 5d40: c208dcc0 c191a2fc 00000000 c208dcc0 00000005 c208dcd8 fffffff9 d0385d78
[ 220.123033] 5d60: c06d5e5c c06d5c60 600f0013 ffffffff
[ 220.128675] __irq_svc from _omap3_noncore_dpll_lock+0x14/0xc4
[ 220.135601] _omap3_noncore_dpll_lock from omap3_noncore_dpll_program+0x14c/0x5e4
[ 220.144176] omap3_noncore_dpll_program from clk_change_rate+0x238/0x4f8
[ 220.151871] clk_change_rate from clk_core_set_rate_nolock+0x1b0/0x29c
[ 220.159248] clk_core_set_rate_nolock from clk_set_rate+0x30/0x64
[ 220.166215] clk_set_rate from _set_opp+0x214/0x528
[ 220.171991] _set_opp from dev_pm_opp_set_rate+0xec/0x228
[ 220.178264] dev_pm_opp_set_rate from __cpufreq_driver_target+0x580/0x6fc
[ 220.186075] __cpufreq_driver_target from od_dbs_update+0xb4/0x168
[ 220.193319] od_dbs_update from dbs_work_handler+0x2c/0x60
[ 220.199733] dbs_work_handler from process_one_work+0x284/0x72c
[ 220.206617] process_one_work from worker_thread+0x28/0x4b0
[ 220.213147] worker_thread from kthread+0xe4/0x104
[ 220.218844] kthread from ret_from_fork+0x14/0x28
[ 220.224350] Exception stack(0xd0385fb0 to 0xd0385ff8)
[ 220.230085] 5fa0: 00000000 00000000 00000000 00000000
[ 220.239020] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 220.247910] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 220.255832] NMI backtrace for cpu 0
[ 220.260006] CPU: 0 PID: 58 Comm: kworker/0:8 Tainted: G W 5.18.0-rc7 #14
[ 220.268798] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 220.275513] Workqueue: events dbs_work_handler
[ 220.280953] unwind_backtrace from show_stack+0x10/0x14
[ 220.287156] show_stack from dump_stack_lvl+0x58/0x70
[ 220.293215] dump_stack_lvl from nmi_cpu_backtrace+0xe0/0x128
[ 220.299984] nmi_cpu_backtrace from nmi_trigger_cpumask_backtrace+0xec/0x184
[ 220.308037] nmi_trigger_cpumask_backtrace from trigger_single_cpu_backtrace+0x20/0x2c
[ 220.316976] trigger_single_cpu_backtrace from rcu_dump_cpu_stacks+0xf8/0x1ec
[ 220.325082] rcu_dump_cpu_stacks from rcu_sched_clock_irq+0xab8/0xf8c
[ 220.332539] rcu_sched_clock_irq from update_process_times+0x88/0xc0
[ 220.339919] update_process_times from tick_sched_handle+0x48/0x54
[ 220.347059] tick_sched_handle from tick_sched_timer+0x48/0xac
[ 220.353781] tick_sched_timer from __hrtimer_run_queues+0x250/0x4e4
[ 220.361027] __hrtimer_run_queues from hrtimer_interrupt+0x128/0x2c8
[ 220.368390] hrtimer_interrupt from dmtimer_clockevent_interrupt+0x24/0x2c
[ 220.376332] dmtimer_clockevent_interrupt from __handle_irq_event_percpu+0x98/0x334
[ 220.385071] __handle_irq_event_percpu from handle_irq_event+0x38/0xc0
[ 220.392548] handle_irq_event from handle_level_irq+0xb4/0x1a8
[ 220.399389] handle_level_irq from handle_irq_desc+0x1c/0x2c
[ 220.406030] handle_irq_desc from generic_handle_arch_irq+0x2c/0x64
[ 220.413261] generic_handle_arch_irq from __irq_svc+0x90/0xbc
[ 220.419847] Exception stack(0xd0001f58 to 0xd0001fa0)
[ 220.425568] 1f40: c01015c8 00000000
[ 220.434531] 1f60: 0eaec000 00000000 fffffe00 600f0013 ffffffff d0385d5c 00000000 c3744a80
[ 220.443512] 1f80: 00000200 c3744a80 c208dcd8 d0001fa8 c01015c8 c01015d0 600f0113 ffffffff
[ 220.452310] __irq_svc from __do_softirq+0xa0/0x5fc
[ 220.458090] __do_softirq from __irq_exit_rcu+0x138/0x178
[ 220.464449] __irq_exit_rcu from irq_exit+0x8/0x28
[ 220.470214] irq_exit from call_with_stack+0x18/0x20
[ 220.476116] call_with_stack from __irq_svc+0x9c/0xbc
[ 220.482007] Exception stack(0xd0385d28 to 0xd0385d70)
[ 220.487816] 5d20: c208dd04 f9e00488 c2006940 c191a2fc c208dcc0 c208a680
[ 220.496775] 5d40: c208dcc0 c191a2fc 00000000 c208dcc0 00000005 c208dcd8 fffffff9 d0385d78
[ 220.505644] 5d60: c06d5e5c c06d5c60 600f0013 ffffffff
[ 220.511288] __irq_svc from _omap3_noncore_dpll_lock+0x14/0xc4
[ 220.518136] _omap3_noncore_dpll_lock from omap3_noncore_dpll_program+0x14c/0x5e4
[ 220.526717] omap3_noncore_dpll_program from clk_change_rate+0x238/0x4f8
[ 220.534368] clk_change_rate from clk_core_set_rate_nolock+0x1b0/0x29c
[ 220.541751] clk_core_set_rate_nolock from clk_set_rate+0x30/0x64
[ 220.548696] clk_set_rate from _set_opp+0x214/0x528
[ 220.554436] _set_opp from dev_pm_opp_set_rate+0xec/0x228
[ 220.560702] dev_pm_opp_set_rate from __cpufreq_driver_target+0x580/0x6fc
[ 220.568481] __cpufreq_driver_target from od_dbs_update+0xb4/0x168
[ 220.575706] od_dbs_update from dbs_work_handler+0x2c/0x60
[ 220.582161] dbs_work_handler from process_one_work+0x284/0x72c
[ 220.589012] process_one_work from worker_thread+0x28/0x4b0
[ 220.595530] worker_thread from kthread+0xe4/0x104
[ 220.601208] kthread from ret_from_fork+0x14/0x28
[ 220.606707] Exception stack(0xd0385fb0 to 0xd0385ff8)
[ 220.612461] 5fa0: 00000000 00000000 00000000 00000000
[ 220.621380] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 220.630266] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
-------------- next part --------------
[ 79.751404] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 79.758633] (detected by 0, t=2602 jiffies, g=4697, q=16429)
[ 79.765139] rcu: All QSes seen, last rcu_sched kthread activity 2602 (-22026--24628), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 79.777563] rcu: rcu_sched kthread starved for 2602 jiffies! g4697 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 79.788374] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 79.797901] rcu: RCU grace-period kthread stack dump:
[ 79.803469] task:rcu_sched state:R running task stack: 0 pid: 11 ppid: 2 flags:0x00000000
[ 79.814789] __schedule from schedule+0x58/0xcc
[ 79.820464] schedule from schedule_timeout+0x78/0xf8
[ 79.826524] schedule_timeout from rcu_gp_fqs_loop+0x108/0x3d0
[ 79.833419] rcu_gp_fqs_loop from rcu_gp_kthread+0xa8/0x134
[ 79.839968] rcu_gp_kthread from kthread+0xe4/0x104
[ 79.845802] kthread from ret_from_fork+0x14/0x28
[ 79.851344] Exception stack(0xd0041fb0 to 0xd0041ff8)
[ 79.857137] 1fa0: 00000000 00000000 00000000 00000000
[ 79.866093] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 79.875005] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 79.882257] rcu: Stack dump where RCU GP kthread last ran:
[ 79.888306] NMI backtrace for cpu 0
[ 79.892364] CPU: 0 PID: 58 Comm: kworker/0:8 Tainted: G W 5.18.0-rc7 #14
[ 79.901162] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 79.907898] Workqueue: events dbs_work_handler
[ 79.913341] unwind_backtrace from show_stack+0x10/0x14
[ 79.919588] show_stack from dump_stack_lvl+0x58/0x70
[ 79.925713] dump_stack_lvl from nmi_cpu_backtrace+0xe0/0x128
[ 79.932520] nmi_cpu_backtrace from nmi_trigger_cpumask_backtrace+0xec/0x184
[ 79.940585] nmi_trigger_cpumask_backtrace from trigger_single_cpu_backtrace+0x20/0x2c
[ 79.949523] trigger_single_cpu_backtrace from rcu_check_gp_kthread_starvation+0xf4/0x148
[ 79.958732] rcu_check_gp_kthread_starvation from rcu_sched_clock_irq+0xe1c/0xf8c
[ 79.967278] rcu_sched_clock_irq from update_process_times+0x88/0xc0
[ 79.974688] update_process_times from tick_sched_handle+0x48/0x54
[ 79.981854] tick_sched_handle from tick_sched_timer+0x48/0xac
[ 79.988576] tick_sched_timer from __hrtimer_run_queues+0x250/0x4e4
[ 79.995829] __hrtimer_run_queues from hrtimer_interrupt+0x128/0x2c8
[ 80.003205] hrtimer_interrupt from dmtimer_clockevent_interrupt+0x24/0x2c
[ 80.011215] dmtimer_clockevent_interrupt from __handle_irq_event_percpu+0x98/0x334
[ 80.019978] __handle_irq_event_percpu from handle_irq_event+0x38/0xc0
[ 80.027458] handle_irq_event from handle_level_irq+0xb4/0x1a8
[ 80.034335] handle_level_irq from handle_irq_desc+0x1c/0x2c
[ 80.040971] handle_irq_desc from generic_handle_arch_irq+0x2c/0x64
[ 80.048196] generic_handle_arch_irq from __irq_svc+0x90/0xbc
[ 80.054809] Exception stack(0xd0001f58 to 0xd0001fa0)
[ 80.060516] 1f40: c01015c8 00000000
[ 80.069486] 1f60: 0eaec000 00000000 fffffffe 600f0013 ffffffff d0385d64 016e3600 c3744a80
[ 80.078437] 1f80: 00000002 c3744a80 ffffffff d0001fa8 c01015c8 c01015d0 600f0113 ffffffff
[ 80.087237] __irq_svc from __do_softirq+0xa0/0x5fc
[ 80.093025] __do_softirq from __irq_exit_rcu+0x138/0x178
[ 80.099460] __irq_exit_rcu from irq_exit+0x8/0x28
[ 80.105230] irq_exit from call_with_stack+0x18/0x20
[ 80.111159] call_with_stack from __irq_svc+0x9c/0xbc
[ 80.117055] Exception stack(0xd0385d30 to 0xd0385d78)
[ 80.122806] 5d20: 00001901 f9e0042c 00000002 f9e00000
[ 80.131771] 5d40: c208dcc0 00000000 c208a680 c191a2fc 016e3600 11e1a300 c1109210 00000000
[ 80.140683] 5d60: fffffff9 d0385d80 c06d5d8c c06d30d8 600f0013 ffffffff
[ 80.147898] __irq_svc from clk_memmap_readl+0x28/0x90
[ 80.154011] clk_memmap_readl from omap3_noncore_dpll_program+0x7c/0x5e4
[ 80.161766] omap3_noncore_dpll_program from clk_change_rate+0x238/0x4f8
[ 80.169435] clk_change_rate from clk_core_set_rate_nolock+0x1b0/0x29c
[ 80.176806] clk_core_set_rate_nolock from clk_set_rate+0x30/0x64
[ 80.183777] clk_set_rate from _set_opp+0x214/0x528
[ 80.189541] _set_opp from dev_pm_opp_set_rate+0xec/0x228
[ 80.195818] dev_pm_opp_set_rate from __cpufreq_driver_target+0x580/0x6fc
[ 80.203634] __cpufreq_driver_target from od_dbs_update+0xb4/0x168
[ 80.210877] od_dbs_update from dbs_work_handler+0x2c/0x60
[ 80.217323] dbs_work_handler from process_one_work+0x284/0x72c
[ 80.224217] process_one_work from worker_thread+0x28/0x4b0
[ 80.230730] worker_thread from kthread+0xe4/0x104
[ 80.236422] kthread from ret_from_fork+0x14/0x28
[ 80.241925] Exception stack(0xd0385fb0 to 0xd0385ff8)
[ 80.247670] 5fa0: 00000000 00000000 00000000 00000000
[ 80.256597] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 80.265480] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
-------------- next part --------------
[ 259.990401] rcu: INFO: rcu_sched self-detected stall on CPU
[ 259.997260] rcu: 0-...!: (2600 ticks this GP) idle=5af/1/0x40000004 softirq=7041/7041 fqs=0
[ 260.006798] (t=2600 jiffies g=16825 q=11323)
[ 260.011833] rcu: rcu_sched kthread timer wakeup didn't happen for 2599 jiffies! g16825 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 260.023878] rcu: Possible timer handling issue on cpu=0 timer-softirq=5692
[ 260.031436] rcu: rcu_sched kthread starved for 2600 jiffies! g16825 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 260.042517] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 260.052059] rcu: RCU grace-period kthread stack dump:
[ 260.057621] task:rcu_sched state:I stack: 0 pid: 11 ppid: 2 flags:0x00000000
[ 260.067142] __schedule from schedule+0x58/0xcc
[ 260.072792] schedule from schedule_timeout+0x78/0xf8
[ 260.078867] schedule_timeout from rcu_gp_fqs_loop+0x108/0x3d0
[ 260.085765] rcu_gp_fqs_loop from rcu_gp_kthread+0xa8/0x134
[ 260.092307] rcu_gp_kthread from kthread+0xe4/0x104
[ 260.098151] kthread from ret_from_fork+0x14/0x28
[ 260.103695] Exception stack(0xd0041fb0 to 0xd0041ff8)
[ 260.109490] 1fa0: 00000000 00000000 00000000 00000000
[ 260.118466] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 260.127383] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 260.134615] rcu: Stack dump where RCU GP kthread last ran:
[ 260.140672] NMI backtrace for cpu 0
[ 260.144724] CPU: 0 PID: 59 Comm: kworker/0:9 Tainted: G W 5.18.0-rc7 #14
[ 260.153508] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 260.160237] Workqueue: events dbs_work_handler
[ 260.165684] unwind_backtrace from show_stack+0x10/0x14
[ 260.171933] show_stack from dump_stack_lvl+0x58/0x70
[ 260.178024] dump_stack_lvl from nmi_cpu_backtrace+0xe0/0x128
[ 260.184833] nmi_cpu_backtrace from nmi_trigger_cpumask_backtrace+0xec/0x184
[ 260.192906] nmi_trigger_cpumask_backtrace from trigger_single_cpu_backtrace+0x20/0x2c
[ 260.201852] trigger_single_cpu_backtrace from rcu_check_gp_kthread_starvation+0xf4/0x148
[ 260.211059] rcu_check_gp_kthread_starvation from rcu_sched_clock_irq+0xa98/0xf8c
[ 260.219589] rcu_sched_clock_irq from update_process_times+0x88/0xc0
[ 260.227022] update_process_times from tick_sched_handle+0x48/0x54
[ 260.234180] tick_sched_handle from tick_sched_timer+0x48/0xac
[ 260.240922] tick_sched_timer from __hrtimer_run_queues+0x250/0x4e4
[ 260.248166] __hrtimer_run_queues from hrtimer_interrupt+0x128/0x2c8
[ 260.255530] hrtimer_interrupt from dmtimer_clockevent_interrupt+0x24/0x2c
[ 260.263551] dmtimer_clockevent_interrupt from __handle_irq_event_percpu+0x98/0x334
[ 260.272314] __handle_irq_event_percpu from handle_irq_event+0x38/0xc0
[ 260.279800] handle_irq_event from handle_level_irq+0xb4/0x1a8
[ 260.286673] handle_level_irq from handle_irq_desc+0x1c/0x2c
[ 260.293323] handle_irq_desc from generic_handle_arch_irq+0x2c/0x64
[ 260.300520] generic_handle_arch_irq from __irq_svc+0x90/0xbc
[ 260.307114] Exception stack(0xd0001f58 to 0xd0001fa0)
[ 260.312847] 1f40: c01015c8 00000000
[ 260.321798] 1f60: 0eaec000 00000000 fffffff8 60020013 ffffffff d0389d34 00000000 c3742a40
[ 260.330763] 1f80: 00000008 c3742a40 ffffffff d0001fa8 c01015c8 c01015d0 60020113 ffffffff
[ 260.339564] __irq_svc from __do_softirq+0xa0/0x5fc
[ 260.345366] __do_softirq from __irq_exit_rcu+0x138/0x178
[ 260.351794] __irq_exit_rcu from irq_exit+0x8/0x28
[ 260.357572] irq_exit from call_with_stack+0x18/0x20
[ 260.363513] call_with_stack from __irq_svc+0x9c/0xbc
[ 260.369403] Exception stack(0xd0389d00 to 0xd0389d48)
[ 260.375239] 9d00: 00000005 f9e00488 00000002 f9e00000 00000007 c208dcd8 c191a2fc c208dcc0
[ 260.384200] 9d20: 00000000 c208dcc0 00000005 c208dcd8 fffffff9 d0389d50 c06d59ec c06d30d8
[ 260.393026] 9d40: 60020013 ffffffff
[ 260.397059] __irq_svc from clk_memmap_readl+0x28/0x90
[ 260.403180] clk_memmap_readl from _omap3_dpll_write_clken+0x24/0x58
[ 260.410566] _omap3_dpll_write_clken from _omap3_noncore_dpll_lock+0x94/0xc4
[ 260.418690] _omap3_noncore_dpll_lock from omap3_noncore_dpll_program+0x14c/0x5e4
[ 260.427274] omap3_noncore_dpll_program from clk_change_rate+0x238/0x4f8
[ 260.434956] clk_change_rate from clk_core_set_rate_nolock+0x1b0/0x29c
[ 260.442340] clk_core_set_rate_nolock from clk_set_rate+0x30/0x64
[ 260.449272] clk_set_rate from _set_opp+0x214/0x528
[ 260.455062] _set_opp from dev_pm_opp_set_rate+0xec/0x228
[ 260.461336] dev_pm_opp_set_rate from __cpufreq_driver_target+0x580/0x6fc
[ 260.469146] __cpufreq_driver_target from od_dbs_update+0xb4/0x168
[ 260.476378] od_dbs_update from dbs_work_handler+0x2c/0x60
[ 260.482821] dbs_work_handler from process_one_work+0x284/0x72c
[ 260.489705] process_one_work from worker_thread+0x28/0x4b0
[ 260.496232] worker_thread from kthread+0xe4/0x104
[ 260.501922] kthread from ret_from_fork+0x14/0x28
[ 260.507432] Exception stack(0xd0389fb0 to 0xd0389ff8)
[ 260.513179] 9fa0: 00000000 00000000 00000000 00000000
[ 260.522110] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 260.531008] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 260.539014] NMI backtrace for cpu 0
[ 260.543179] CPU: 0 PID: 59 Comm: kworker/0:9 Tainted: G W 5.18.0-rc7 #14
[ 260.551963] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 260.558674] Workqueue: events dbs_work_handler
[ 260.564131] unwind_backtrace from show_stack+0x10/0x14
[ 260.570357] show_stack from dump_stack_lvl+0x58/0x70
[ 260.576398] dump_stack_lvl from nmi_cpu_backtrace+0xe0/0x128
[ 260.583157] nmi_cpu_backtrace from nmi_trigger_cpumask_backtrace+0xec/0x184
[ 260.591223] nmi_trigger_cpumask_backtrace from trigger_single_cpu_backtrace+0x20/0x2c
[ 260.600156] trigger_single_cpu_backtrace from rcu_dump_cpu_stacks+0xf8/0x1ec
[ 260.608284] rcu_dump_cpu_stacks from rcu_sched_clock_irq+0xab8/0xf8c
[ 260.615746] rcu_sched_clock_irq from update_process_times+0x88/0xc0
[ 260.623134] update_process_times from tick_sched_handle+0x48/0x54
[ 260.630280] tick_sched_handle from tick_sched_timer+0x48/0xac
[ 260.636991] tick_sched_timer from __hrtimer_run_queues+0x250/0x4e4
[ 260.644235] __hrtimer_run_queues from hrtimer_interrupt+0x128/0x2c8
[ 260.651601] hrtimer_interrupt from dmtimer_clockevent_interrupt+0x24/0x2c
[ 260.659558] dmtimer_clockevent_interrupt from __handle_irq_event_percpu+0x98/0x334
[ 260.668288] __handle_irq_event_percpu from handle_irq_event+0x38/0xc0
[ 260.675782] handle_irq_event from handle_level_irq+0xb4/0x1a8
[ 260.682627] handle_level_irq from handle_irq_desc+0x1c/0x2c
[ 260.689264] handle_irq_desc from generic_handle_arch_irq+0x2c/0x64
[ 260.696486] generic_handle_arch_irq from __irq_svc+0x90/0xbc
[ 260.703067] Exception stack(0xd0001f58 to 0xd0001fa0)
[ 260.708780] 1f40: c01015c8 00000000
[ 260.717753] 1f60: 0eaec000 00000000 fffffff8 60020013 ffffffff d0389d34 00000000 c3742a40
[ 260.726702] 1f80: 00000008 c3742a40 ffffffff d0001fa8 c01015c8 c01015d0 60020113 ffffffff
[ 260.735511] __irq_svc from __do_softirq+0xa0/0x5fc
[ 260.741288] __do_softirq from __irq_exit_rcu+0x138/0x178
[ 260.747659] __irq_exit_rcu from irq_exit+0x8/0x28
[ 260.753441] irq_exit from call_with_stack+0x18/0x20
[ 260.759337] call_with_stack from __irq_svc+0x9c/0xbc
[ 260.765228] Exception stack(0xd0389d00 to 0xd0389d48)
[ 260.771072] 9d00: 00000005 f9e00488 00000002 f9e00000 00000007 c208dcd8 c191a2fc c208dcc0
[ 260.780031] 9d20: 00000000 c208dcc0 00000005 c208dcd8 fffffff9 d0389d50 c06d59ec c06d30d8
[ 260.788844] 9d40: 60020013 ffffffff
[ 260.792883] __irq_svc from clk_memmap_readl+0x28/0x90
[ 260.798946] clk_memmap_readl from _omap3_dpll_write_clken+0x24/0x58
[ 260.806301] _omap3_dpll_write_clken from _omap3_noncore_dpll_lock+0x94/0xc4
[ 260.814421] _omap3_noncore_dpll_lock from omap3_noncore_dpll_program+0x14c/0x5e4
[ 260.823015] omap3_noncore_dpll_program from clk_change_rate+0x238/0x4f8
[ 260.830702] clk_change_rate from clk_core_set_rate_nolock+0x1b0/0x29c
[ 260.838091] clk_core_set_rate_nolock from clk_set_rate+0x30/0x64
[ 260.845045] clk_set_rate from _set_opp+0x214/0x528
[ 260.850797] _set_opp from dev_pm_opp_set_rate+0xec/0x228
[ 260.857071] dev_pm_opp_set_rate from __cpufreq_driver_target+0x580/0x6fc
[ 260.864863] __cpufreq_driver_target from od_dbs_update+0xb4/0x168
[ 260.872082] od_dbs_update from dbs_work_handler+0x2c/0x60
[ 260.878522] dbs_work_handler from process_one_work+0x284/0x72c
[ 260.885357] process_one_work from worker_thread+0x28/0x4b0
[ 260.891875] worker_thread from kthread+0xe4/0x104
[ 260.897568] kthread from ret_from_fork+0x14/0x28
[ 260.903073] Exception stack(0xd0389fb0 to 0xd0389ff8)
[ 260.908814] 9fa0: 00000000 00000000 00000000 00000000
[ 260.917745] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 260.926641] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
-------------- next part --------------
[ 112.951462] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 112.958658] (detected by 0, t=2602 jiffies, g=4733, q=11060)
[ 112.965167] rcu: All QSes seen, last rcu_sched kthread activity 2602 (-18706--21308), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 112.977570] rcu: rcu_sched kthread starved for 2602 jiffies! g4733 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 112.988383] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 112.997921] rcu: RCU grace-period kthread stack dump:
[ 113.003480] task:rcu_sched state:R running task stack: 0 pid: 11 ppid: 2 flags:0x00000000
[ 113.014832] __schedule from schedule+0x58/0xcc
[ 113.020467] schedule from schedule_timeout+0x78/0xf8
[ 113.026535] schedule_timeout from rcu_gp_fqs_loop+0x108/0x3d0
[ 113.033431] rcu_gp_fqs_loop from rcu_gp_kthread+0xa8/0x134
[ 113.039978] rcu_gp_kthread from kthread+0xe4/0x104
[ 113.045824] kthread from ret_from_fork+0x14/0x28
[ 113.051356] Exception stack(0xd0041fb0 to 0xd0041ff8)
[ 113.057147] 1fa0: 00000000 00000000 00000000 00000000
[ 113.066104] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 113.075023] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 113.082279] rcu: Stack dump where RCU GP kthread last ran:
[ 113.088341] NMI backtrace for cpu 0
[ 113.092378] CPU: 0 PID: 62 Comm: kworker/0:12 Not tainted 5.18.0-rc7 #14
[ 113.099831] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 113.106561] Workqueue: events dbs_work_handler
[ 113.112021] unwind_backtrace from show_stack+0x10/0x14
[ 113.118272] show_stack from dump_stack_lvl+0x58/0x70
[ 113.124379] dump_stack_lvl from nmi_cpu_backtrace+0xe0/0x128
[ 113.131208] nmi_cpu_backtrace from nmi_trigger_cpumask_backtrace+0xec/0x184
[ 113.139264] nmi_trigger_cpumask_backtrace from trigger_single_cpu_backtrace+0x20/0x2c
[ 113.148200] trigger_single_cpu_backtrace from rcu_check_gp_kthread_starvation+0xf4/0x148
[ 113.157383] rcu_check_gp_kthread_starvation from rcu_sched_clock_irq+0xe1c/0xf8c
[ 113.165916] rcu_sched_clock_irq from update_process_times+0x88/0xc0
[ 113.173339] update_process_times from tick_sched_handle+0x48/0x54
[ 113.180505] tick_sched_handle from tick_sched_timer+0x48/0xac
[ 113.187240] tick_sched_timer from __hrtimer_run_queues+0x250/0x4e4
[ 113.194469] __hrtimer_run_queues from hrtimer_interrupt+0x128/0x2c8
[ 113.201827] hrtimer_interrupt from dmtimer_clockevent_interrupt+0x24/0x2c
[ 113.209839] dmtimer_clockevent_interrupt from __handle_irq_event_percpu+0x98/0x334
[ 113.218603] __handle_irq_event_percpu from handle_irq_event+0x38/0xc0
[ 113.226084] handle_irq_event from handle_level_irq+0xb4/0x1a8
[ 113.232972] handle_level_irq from handle_irq_desc+0x1c/0x2c
[ 113.239613] handle_irq_desc from generic_handle_arch_irq+0x2c/0x64
[ 113.246842] generic_handle_arch_irq from call_with_stack+0x18/0x20
[ 113.254073] call_with_stack from __irq_svc+0x9c/0xbc
[ 113.259973] Exception stack(0xd0395d40 to 0xd0395d88)
[ 113.265824] 5d40: 00000005 f9e00488 00000000 00000000 c208dcc0 00001901 c208a680 c191a2fc
[ 113.274781] 5d60: 00000000 c208dcc0 c1109210 c208dcd8 fffffff9 d0395d90 c06d5ef0 c06d6104
[ 113.283594] 5d80: 60070013 ffffffff
[ 113.287629] __irq_svc from omap3_noncore_dpll_program+0x3f4/0x5e4
[ 113.294907] omap3_noncore_dpll_program from clk_change_rate+0x238/0x4f8
[ 113.302572] clk_change_rate from clk_core_set_rate_nolock+0x1b0/0x29c
[ 113.309950] clk_core_set_rate_nolock from clk_set_rate+0x30/0x64
[ 113.316908] clk_set_rate from _set_opp+0x260/0x528
[ 113.322680] _set_opp from dev_pm_opp_set_rate+0xec/0x228
[ 113.328969] dev_pm_opp_set_rate from __cpufreq_driver_target+0x580/0x6fc
[ 113.336784] __cpufreq_driver_target from od_dbs_update+0xb4/0x168
[ 113.344024] od_dbs_update from dbs_work_handler+0x2c/0x60
[ 113.350466] dbs_work_handler from process_one_work+0x284/0x72c
[ 113.357326] process_one_work from worker_thread+0x28/0x4b0
[ 113.363841] worker_thread from kthread+0xe4/0x104
[ 113.369532] kthread from ret_from_fork+0x14/0x28
[ 113.375038] Exception stack(0xd0395fb0 to 0xd0395ff8)
[ 113.380793] 5fa0: 00000000 00000000 00000000 00000000
[ 113.389727] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 113.398614] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000