cpuidle on i.MX8MQ

Martin Kepplinger martin.kepplinger at puri.sm
Mon Nov 29 05:40:04 PST 2021


Am Donnerstag, dem 04.11.2021 um 13:04 +0200 schrieb Abel Vesa:
> On 21-11-03 13:09:15, Martin Kepplinger wrote:
> > Am Dienstag, dem 02.11.2021 um 11:55 +0100 schrieb Alexander Stein:
> > > Hello,
> > > 
> > > I was hit by erratum e11171 on the i.MX8MQ on our custom board. I
> > > found [1] from over two years ago, and the even older patch set [2].
> > > Is there some final conclusion or fix regarding this erratum? From
> > > what I understand, the proposed change is apparently not acceptable
> > > in mainline for several reasons. I'm wondering what the current
> > > status is.
> 
> Unfortunately, there is not going to be an upstream solution for this
> erratum. Long story short, the SoC is missing wakeup lines from the
> GIC to the GPC. This means the IPIs are affected. So, knowing all
> that, in order to wake up a core, you need to write a bit in some
> register in the GPC. The SW workaround (not upstreamable) I provided
> does exactly that by hijacking the gic_raise_softirq __smp_cross_call
> handler and registering a wrapper over it which also calls into ATF
> (using a SIP call) and wakes up that specific core by writing into
> the GPC register.
> 
> There is no other possible way to wake up a core on 8MQ.
> 
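
(To illustrate for anyone following along: the shape of that hack is
roughly the sketch below. This is only an illustration of the idea, not
the actual patch; the SIP function ID and the imx8mq_* / orig_* names
are placeholders, and, as far as I can tell, newer kernels no longer
expose the set_smp_cross_call()/__smp_cross_call hook it relies on.)

/*
 * Rough sketch only -- see the tree linked further down for the real code.
 */
#include <linux/arm-smccc.h>
#include <linux/cpumask.h>

/* Placeholder SIP function ID; the real one is whatever the patched ATF expects. */
#define IMX_SIP_WAKEUP_CORE     0xc2000000

/* Original __smp_cross_call handler (gic_raise_softirq), saved when the wrapper is installed. */
static void (*orig_raise_softirq)(const struct cpumask *mask, unsigned int irq);

/* Ask ATF, via a SIP call, to set the wakeup bit for @cpu in the GPC. */
static void imx8mq_gpc_wakeup_core(int cpu)
{
        struct arm_smccc_res res;

        arm_smccc_smc(IMX_SIP_WAKEUP_CORE, cpu, 0, 0, 0, 0, 0, 0, &res);
}

/*
 * Wrapper registered in place of gic_raise_softirq(): poke the GPC for
 * every target CPU so a core sleeping in a deep idle state actually gets
 * woken for the IPI (erratum e11171: no GIC-to-GPC wakeup lines for SGIs),
 * then send the IPI as usual.
 */
static void imx8mq_raise_softirq(const struct cpumask *mask, unsigned int irq)
{
        int cpu;

        for_each_cpu(cpu, mask)
                imx8mq_gpc_wakeup_core(cpu);

        orig_raise_softirq(mask, irq);
}
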
> > > As suggested at that time, is the only solution (right now) to
> > > disable cpuidle on imx8mq?
> > > 
> 
> Yes, the vendor actually suggests that, but you can use the mentioned
> hack.
> 
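
(Side note for others who land here: if you go the "disable it" route,
you don't necessarily need DT changes; booting with the cpuidle.off
kernel parameter switches the cpuidle subsystem off entirely:

    cpuidle.off=1

That of course costs idle power, which is why we keep using the hack
instead, more on that below.)
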
> > > Best regards,
> > > Alexander
> > > 
> > > [1] https://lkml.org/lkml/2019/6/10/350
> > > [2] https://lkml.org/lkml/2019/3/27/542
> > > 
> > 
> > Hi Alexander, hi Abel,
> > 
> > At this point my understanding is basically the same. We have carried
> > (a slight variation of) the above in our tree ever since, in order to
> > have the cpu-sleep idle state. Not using it is not acceptable to us :)
> > 
> > So far there's one internal API change we need to revert (bring back)
> > in order for this to work. For reference, this is our current
> > implementation:
> > 
> > https://source.puri.sm/martin.kepplinger/linux-next/-/compare/0b90c3622755e0155632d8cc25edd4eb7f875968...ce4803745a180adc8d87891d4ff8dff1c7bd5464
> > 
> > Abel, can you still say that, in case this solution no longer applies
> > in the future, you would be available to create an update?
> > 
> 
> I'll try to find a workaround soon, based on the same general idea
> behind the current one you guys are using. I'll do this in my own
> time, since the company does not allocate resources for 8MQ cpuidle
> support anymore.
> 
> > Can you imagine any solution to this that might be acceptable for
> > mainline? Nothing is completely set in stone with Linux :)
> 
> I believe Marc was pretty clear about not accepting such a workaround
> (and, TBH, it makes perfect sense not to).
> 
> Since I don't think there is any other way that would go around the
> gic driver, I believe this has hit an end when it comes to upstream
> support.
> 
> Sorry about that.
> 
> I'm open to any suggestions though.
> 
> 

hi Abel, since the link to the workaround implementation is here, I'd
like to show you a bug that occurs when transitioning to s2idle. I
don't see it when I remove all these cpu-sleep additions (linked
above). (I might send this as a separate bug report later.)

Can you see how the workaround could cause this RCU stall? It looks
like a problem with a timer...


[ 65.476456] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 65.476615] rcu: 0-...!: (1 ticks this GP) idle=42f/1/0x4000000000000004 softirq=9151/9151 fqs=0
[ 65.476676] (t=8974 jiffies g=11565 q=2)
[ 65.476703] rcu: rcu_preempt kthread timer wakeup didn't happen for 8973 jiffies! g11565 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 65.476715] rcu: Possible timer handling issue on cpu=0 timer-softirq=2032
[ 65.476730] rcu: rcu_preempt kthread starved for 8974 jiffies! g11565 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 65.476742] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 65.476749] rcu: RCU grace-period kthread stack dump:
[ 65.476764] task:rcu_preempt state:I stack: 0 pid: 13 ppid: 2 flags:0x00000008
[ 65.476814] Call trace:
[ 65.476825] __switch_to+0x138/0x190
[ 65.476975] __schedule+0x288/0x6ec
[ 65.477044] schedule+0x7c/0x110
[ 65.477059] schedule_timeout+0xa4/0x1c4
[ 65.477085] rcu_gp_fqs_loop+0x13c/0x51c
[ 65.477126] rcu_gp_kthread+0x1a4/0x264
[ 65.477136] kthread+0x15c/0x170
[ 65.477167] ret_from_fork+0x10/0x20
[ 65.477186] rcu: Stack dump where RCU GP kthread last ran:
[ 65.477194] Task dump for CPU 0:
[ 65.477202] task:swapper/0 state:R running task stack: 0 pid: 0 ppid: 0 flags:0x0000000a
[ 65.477223] Call trace:
[ 65.477226] dump_backtrace+0x0/0x1e4
[ 65.477246] show_stack+0x24/0x30
[ 65.477256] sched_show_task+0x15c/0x180
[ 65.477293] dump_cpu_task+0x50/0x60
[ 65.477327] rcu_check_gp_kthread_starvation+0x128/0x148
[ 65.477335] rcu_sched_clock_irq+0xb74/0xf04
[ 65.477348] update_process_times+0xa8/0xf4
[ 65.477388] tick_sched_handle+0x3c/0x60
[ 65.477409] tick_sched_timer+0x58/0xb0
[ 65.477416] __hrtimer_run_queues+0x18c/0x370
[ 65.477428] hrtimer_interrupt+0xf4/0x250
[ 65.477437] arch_timer_handler_phys+0x40/0x50
[ 65.477477] handle_percpu_devid_irq+0x94/0x250
[ 65.477505] handle_domain_irq+0x6c/0xa0
[ 65.477516] gic_handle_irq+0xc4/0x144
[ 65.477527] call_on_irq_stack+0x2c/0x54
[ 65.477534] do_interrupt_handler+0x5c/0x70
[ 65.477544] el1_interrupt+0x30/0x80
[ 65.477556] el1h_64_irq_handler+0x18/0x24
[ 65.477567] el1h_64_irq+0x78/0x7c
[ 65.477575] cpuidle_enter_s2idle+0x14c/0x1ac
[ 65.477617] do_idle+0x25c/0x2a0
[ 65.477644] cpu_startup_entry+0x30/0x80
[ 65.477656] rest_init+0xec/0x100
[ 65.477666] arch_call_rest_init+0x1c/0x28
[ 65.477700] start_kernel+0x6e0/0x720
[ 65.477709] __primary_switched+0xc0/0xc8
[ 65.477751] Task dump for CPU 0:
[ 65.477757] task:swapper/0 state:R running task stack: 0 pid: 0 ppid: 0 flags:0x0000000a
[ 65.477770] Call trace:
[ 65.477773] dump_backtrace+0x0/0x1e4
[ 65.477788] show_stack+0x24/0x30
[ 65.477796] sched_show_task+0x15c/0x180
[ 65.477804] dump_cpu_task+0x50/0x60
[ 65.477812] rcu_dump_cpu_stacks+0xf4/0x138
[ 65.477820] rcu_sched_clock_irq+0xb78/0xf04
[ 65.477829] update_process_times+0xa8/0xf4
[ 65.477838] tick_sched_handle+0x3c/0x60
[ 65.477845] tick_sched_timer+0x58/0xb0
[ 65.477854] __hrtimer_run_queues+0x18c/0x370
[ 65.477863] hrtimer_interrupt+0xf4/0x250
[ 65.477873] arch_timer_handler_phys+0x40/0x50
[ 65.477880] handle_percpu_devid_irq+0x94/0x250
[ 65.477888] handle_domain_irq+0x6c/0xa0
[ 65.477897] gic_handle_irq+0xc4/0x144
[ 65.477903] call_on_irq_stack+0x2c/0x54
[ 65.477910] do_interrupt_handler+0x5c/0x70
[ 65.477921] el1_interrupt+0x30/0x80
[ 65.477929] el1h_64_irq_handler+0x18/0x24
[ 65.477937] el1h_64_irq+0x78/0x7c
[ 65.477944] cpuidle_enter_s2idle+0x14c/0x1ac
[ 65.477952] do_idle+0x25c/0x2a0
[ 65.477959] cpu_startup_entry+0x30/0x80
[ 65.477970] rest_init+0xec/0x100
[ 65.477977] arch_call_rest_init+0x1c/0x28
[ 65.477988] start_kernel+0x6e0/0x720
[ 65.477995] __primary_switched+0xc0/0xc8




