rk3399 fails to boot since v6.12.7
Marc Zyngier
maz at kernel.org
Thu Jan 30 01:58:58 PST 2025
Hi Christoph,
Thanks for reporting this.
On Wed, 29 Jan 2025 21:31:53 +0000,
Christoph Fritz <chf.fritz at googlemail.com> wrote:
>
> Hello Marc,
>
> since 773c05f417fa1 ("irqchip/gic-v3: Work around insecure GIC
> integrations") landed in stable v6.12.7 as 0bf32f482887, here the
> rk3399 fails to boot (~4 out of 10 times) because OP-TEE panics (gets a
> secure interrupt that it cannot handle).
I think it may actually get a *non-secure* interrupt.
>
> Setup is:
> - BL31 proprietary (since mainline TF-A has no DMA)
> - OP-TEE mainline version: 3.20
> - Kernel v6.12.7
>
> <snip>
> [ 0.000000] GICv3: Broken GIC integration, security disabled
> <snip>
> E/TC:4 0 Panic 'Secure interrupt handler not defined' at core/kernel/interrupt.c:139 <itr_core_handler>
> E/TC:4 0 TEE load address @ 0x30000000
> E/TC:4 0 Call stack:
> E/TC:4 0 0x300091f8
> E/TC:4 0 0x30016664
> E/TC:4 0 0x30015710
> E/TC:4 0 0x30005714
> [ 26.087363] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 26.087925] rcu: (detected by 2, t=21002 jiffies, g=2233, q=2416 ncpus=6)
> [ 26.088530] rcu: All QSes seen, last rcu_preempt kthread activity 21002 (4294693363-4294672361), jiffies_till_next_fqs=3, root ->qsmask 0x0
> [ 26.089623] rcu: rcu_preempt kthread timer wakeup didn't happen for 20999 jiffies! g2233 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200
> [ 26.090617] rcu: Possible timer handling issue on cpu=4 timer-softirq=293
> [ 26.091218] rcu: rcu_preempt kthread starved for 21002 jiffies! g2233 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=4
> [ 26.092131] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> [ 26.092926] rcu: RCU grace-period kthread stack dump:
> [ 26.093369] task:rcu_preempt state:R stack:0 pid:16 tgid:16 ppid:2 flags:0x00000008
> [ 26.094192] Call trace:
> [ 26.094409] __switch_to+0xf0/0x14c
> [ 26.094728] __schedule+0x264/0xa90
> [ 26.095040] schedule+0x34/0x104
> [ 26.095329] schedule_timeout+0x80/0xf4
> [ 26.095672] rcu_gp_fqs_loop+0x14c/0x4a4
> [ 26.096027] rcu_gp_kthread+0x138/0x164
> [ 26.096369] kthread+0x114/0x118
> [ 26.096661] ret_from_fork+0x10/0x20
> [ 26.096982] rcu: Stack dump where RCU GP kthread last ran:
> [ 26.097463] Sending NMI from CPU 2 to CPUs 4:
> [ 46.276364] sched: DL replenish lagged too much
>
> Is it too late for the kernel to disable the "security" since OP-TEE
> assumes it is enabled?
Well, it was never effective security in the first place, but clearly
this effect is not expected (my own machine doesn't have anything
running on the secure side).
> Any ideas?
I think this calls for a revert of this patch, potentially at the
expense of NMI support on this machine. Could you show how SCR_EL3.FIQ
is configured on this machine? Mine shows:
[ 0.000000] GICv3: GICD_CTRL.DS=0, SCR_EL3.FIQ=0
and I suspect yours has FIQ=1. If that's the case, we could use that
as the discriminant.
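To illustrate what using FIQ as the discriminant could look like, here is
a minimal user-space sketch, not driver code. The struct, its fields and
the function name are all hypothetical, and the rule itself (skip the
workaround whenever SCR_EL3.FIQ=1) is only an assumption about how the
check might be shaped:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of what the driver reports at boot. */
struct gic_boot_state {
	bool gicd_ctrl_ds;	/* GICD_CTRL.DS as read by the kernel */
	bool scr_el3_fiq;	/* SCR_EL3.FIQ as shown in the boot log */
	bool looks_broken;	/* hypothetical: integration matched the quirk */
};

/*
 * Decide whether to apply the "security disabled" workaround.
 * Assumption: FIQ=1 means firmware routes FIQs to EL3 and a secure
 * payload may depend on them, so the quirk is skipped in that case.
 */
static bool gic_apply_insecure_quirk(const struct gic_boot_state *s)
{
	assert(s != NULL);

	if (!s->looks_broken)
		return false;

	return !s->scr_el3_fiq;
}
```

With DS=0/FIQ=0 (as on my machine) the quirk would still apply; with
FIQ=1 it would be skipped and the secure side left alone.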
However, this machine has a much bigger issue. For things to work as
expected, the GIC driver must preserve all the secure configuration,
and nothing does that today.
So even before this patch, your secure payload won't get any
interrupt, as we blindly configure everything to be Group-1NS.
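As a rough model of the difference between the blind setup and one that
preserves the secure configuration, consider the sketch below. It works
on plain arrays standing in for the 32-bit GICD_IGROUPR banks (bit set =
Group 1). The function names and the secure_owned[] mask are
hypothetical; nothing provides such a mask today, which is exactly the
problem:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Blind setup: every line in every 32-bit bank becomes Group 1. */
static void igroupr_set_all_group1(uint32_t *banks, size_t n)
{
	assert(banks != NULL);

	for (size_t i = 0; i < n; i++)
		banks[i] = UINT32_MAX;
}

/*
 * Preserving setup: lines flagged in secure_owned[] keep whatever group
 * the firmware programmed; all other lines become Group 1.
 * secure_owned[] is a hypothetical input for illustration only.
 */
static void igroupr_set_group1_preserving(uint32_t *banks,
					  const uint32_t *secure_owned,
					  size_t n)
{
	assert(banks != NULL && secure_owned != NULL);

	for (size_t i = 0; i < n; i++)
		banks[i] = (banks[i] & secure_owned[i]) | ~secure_owned[i];
}
```

The first function loses any Group 0 assignment made by firmware; the
second keeps it for the lines the secure side claims.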
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
More information about the Linux-rockchip mailing list