[U-Boot] [PATCH 2/2] rockchip: rk3399: rockpro64: enable force power on reset workaround
Lee Jones
lee.jones at linaro.org
Thu May 19 01:17:14 PDT 2022
On Wed, 18 May 2022, Peter Geis wrote:
> On Wed, May 18, 2022 at 7:56 AM Lee Jones <lee.jones at linaro.org> wrote:
> >
> > Looping in a few relevant/active kernel people/lists for full coverage.
> >
> > On Sun, 01 Dec 2019, Hugh Cole-Baker wrote:
> > > > On 29 Nov 2019, at 01:06, Vasily Khoruzhick <anarsoul at gmail.com> wrote:
> > > > On Thu, Nov 28, 2019 at 4:59 PM Kever Yang <kever.yang at rock-chips.com> wrote:
> > > >>
> > > >> Hi Vasily,
> > > >>
> > > >> On 2019/11/28 11:51 PM, Vasily Khoruzhick wrote:
> > > >>> On Thu, Nov 28, 2019 at 1:23 AM Kever Yang <kever.yang at rock-chips.com> wrote:
> > > >>>> Hi Vasily,
> > > >>>>
> > > >>>> I think this should not be needed, see comments below.
> > > >>> Hi Kever,
> > > >>>
> > > >>> I've spent 2 weeks of my evenings debugging this issue but
> > > >>
> > > >> I understand that you worked pretty hard to make this work; it's not always
> > > >> easy to identify the root cause. Thanks very much for working on this.
> > > >>
> > > >>> unfortunately I don't have a proper fix. This is the only solution
> > > >>> that makes my rockpro64 reboot reliably with mainline u-boot and ATF.
> > > >>> See my comments below.
> > >
> > > I also had a problem where Linux would hang or panic after rebooting, with
> > > mainline u-boot and ATF on a rockpro64. This patch does fix the issue for me:
> > > I have tested it by performing 10 reboots from Linux in a row and have seen
> > > no hangs or panics.
> > >
> > > I noticed the Armbian project have recently included a patch to ATF [1] which
> > > switches all power domains on before ATF performs a soft reset. I have also
> > > tested using u-boot mainline, without any patches to u-boot, but including ATF
> > > patched with your reset fix [2] and the Armbian power domains patch [1]. This
> > > also fixes the same hanging on reboot issue for me without modifications to
> > > u-boot, I've also tested 10 reboots in a row with this ATF and seen no hangs.
> > >
> > > So this u-boot patch may not be needed if ATF is patched instead to switch
> > > power domains on before soft reset.
> > >
> > > FWIW, when I was able to see panic messages from Linux when it panicked on
> > > boot, the call trace always seemed to include rockchip_pd_power_off() [3].
> > >
> > > [1] https://github.com/armbian/build/blob/master/patch/atf/atf-rk3399/switch-power-domains-on-before-reset.patch
> > > [2] https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/2512
> > > [3] https://gist.github.com/sigmaris/c0e155c8cb0a325d84f549185f9a568c
> >
> > This last paste looks remarkably similar to an issue currently seen on
> > the Radxa ROCK Pi 4B (RK3399) during power-up after a soft reboot
> > (`sudo reboot`) is issued. We're presently running v5.15.35 [0].
>
> Good Evening,
Hi, Peter,
Thank you so much for your reply.
> That's definitely not stock v5.15.35, it's been tagged as an android kernel.
> 5.15.35-android13-5-00092-g525d77310a20
It's not stock, no.
Although the differences from RockPi's perspective are minimal.
The main difference is the way the kernel is configured.
It's GKI:
https://android.googlesource.com/kernel/common/+/refs/heads/android13-5.15/arch/arm64/configs/gki_defconfig
Plus a few non-GKI specifics:
https://android.googlesource.com/kernel/common/+/refs/heads/android13-5.15/arch/arm64/configs/rockpi4_gki.fragment
> > It's not clear how this issue (present 3 years ago) was finally
> > resolved. From the thread, it looks as if the fix might have made its
> > way into ATF, but I'm 87.6% sure ATF is not running on this platform
> > (yet).
>
> The rk3399 SoC has a hardware bug where the power domains are not
> reset upon a soft reset. This leads to situations like this one where
> power domains are shut down during shutdown but aren't restored on
> reboot.
I assume this isn't something we can patch in the kernel driver?
> Mainline TF-A was patched to force all power domains online
> when a soft reboot is triggered, which solved that issue.
Okay, this is what I figured.
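To check my understanding of the failure mode described above, here is a toy
model in Python. It is only a sketch: the class, the domain name and the
"bus fault" error are all made up for illustration and do not correspond to
real RK3399 registers or to the actual TF-A change.

```python
class ToySoC:
    """Toy model of power-domain state across a soft reset (illustrative only)."""

    def __init__(self, domains):
        # Every domain starts powered, as after a cold boot.
        self.powered = {name: True for name in domains}

    def power_off(self, name):
        # Models the OS gating a domain, e.g. on the way down to reboot.
        self.powered[name] = False

    def soft_reset(self, force_domains_on=False):
        # Per the thread, a soft reset does not restore power-domain state.
        # force_domains_on models firmware switching every domain back on
        # before the reset is triggered.
        if force_domains_on:
            for name in self.powered:
                self.powered[name] = True

    def read_register(self, name):
        # Touching a block in an unpowered domain is modelled as a fault.
        if not self.powered[name]:
            raise RuntimeError(f"bus fault: domain {name!r} is off")
        return 0


soc = ToySoC(["hypothetical_iommu_domain"])
soc.power_off("hypothetical_iommu_domain")
soc.soft_reset()  # the domain stays off across the reset
try:
    soc.read_register("hypothetical_iommu_domain")
except RuntimeError as err:
    print("after plain soft reset:", err)

soc.soft_reset(force_domains_on=True)  # the workaround path
print("after forced power-on:", soc.read_register("hypothetical_iommu_domain"))
```

If that model is roughly right, it would explain why the fault shows up on the
first register access after reboot rather than at reset time.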
> What particular issues are you having initializing modern u-boot on
> this device?
This is the output: https://pastebin.ubuntu.com/p/d5DmsSBnrR/
Speaking with one of the guys who supports RockPi 4 in AOSP, he
suspects the DDR settings. Apparently settings for older SoCs
sometimes get clobbered when support for newer SoCs is added.
I am yet to investigate the u-boot story in any detail.
It's on my TODO list for today.
> Is there a particular reason it isn't using Mainline TF-A?
We're not using Trusted Firmware yet.
Although I'm starting to think this should be re-prioritised.
> I've also run into issues on rk356x where the regulator powering a
> power domain isn't powered due to a soft reset, which also causes
> faults like this. Set your main regulators to always-on and see if it
> helps with the issue.
I'll do that. Thanks for the tip.
Our main issue currently is an RCU-lock-up, again on soft reboot:
[ 21.226951][ C0] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 21.227637][ C0] rcu: 5-...!: (1 GPs behind) idle=3de/1/0x4000000000000000 softirq=9/10 fqs=3 last_accelerate: 0000/efb9 dyntick_enabled: 0
[ 21.228890][ C0] (detected by 0, t=5252 jiffies, g=-1167, q=46)
Do you think these issues could all be related?
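To compare boots, a small script along these lines could tag the fault markers
in each captured log. It is a rough sketch: the function name and the event
labels are my own, and it assumes the "[ time][ Cn]" prefix format seen in the
logs in this thread.

```python
import re

# "[ time][ Cn]" or "[ time][ Tn]" prefix, as in the logs quoted in this thread.
PREFIX = re.compile(r"^\[\s*(\d+\.\d+)\]\[\s*([CT]\d+)\]\s*(.*)$")

# Substring to look for -> label to report (labels are arbitrary).
MARKERS = [
    ("SError Interrupt on CPU", "serror"),
    ("rcu_preempt detected stalls", "rcu_stall"),
    ("rockchip_pd_power_off", "pd_power_off_in_trace"),
]


def find_fault_markers(log_text):
    """Return a list of (timestamp, label) for each marker line in the log."""
    events = []
    for raw in log_text.splitlines():
        # Drop mail quoting ("> > ") so pasted threads can be scanned too.
        line = raw.lstrip("> ").strip()
        match = PREFIX.match(line)
        if not match:
            continue
        timestamp = float(match.group(1))
        message = match.group(3)
        for needle, label in MARKERS:
            if needle in message:
                events.append((timestamp, label))
                break
    return events


sample = """\
> > [ 0.702238][ C5] SError Interrupt on CPU5, code 0xbf000002 -- SError
> > [ 0.702623][ C5] rockchip_pd_power+0x6c4/0xbc0
> > [ 0.702638][ C5] rockchip_pd_power_off+0x18/0x28
[ 21.226951][ C0] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
"""
for timestamp, label in find_fault_markers(sample):
    print(f"{timestamp:>10.6f}  {label}")
```

Seeing whether the RCU stall boots also contain a power-domain trace would be
one cheap way to test whether the two symptoms share a cause.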
Thanks ever so much for your reply Peter.
You've potentially saved us hours and hours of debugging.
Kind regards,
Lee
> > Note that the u-boot we're using is also quite old:
> >
> > U-Boot 2019.10-09248-g8511c75bb4 (Jan 08 2020 - 17:13:03 -0800)
> >
> > ... so this could easily be the root cause. The current plan is to
> > try to update this ASAP. However early attempts are yet to result in
> > a successful boot.
> >
> > I see that Brian recently added a few patches related to PD/DVFS, but
> > again, these appear to be ATF related.
> >
> > Would anyone be able to shed some light onto this for me please?
> >
> > As always, any help would be gratefully received.
> >
> > Kind regards,
> > Lee
> >
> > [0]
> > Full reboot log can be seen at: https://pastebin.ubuntu.com/p/MjZP2V6kQ3/
> >
> > [ 0.699736][ T1] initcall __initstub__kmod_iommu__362_155_iommu_subsys_init4+0x0/0x8 returned 0 after 0 usecs
> > [ 0.700737][ T1] calling __initstub__kmod_rockchip_iommu__348_1415_rk_iommu_init4+0x0/0x8 @ 1
> > [ 0.702238][ C5] SError Interrupt on CPU5, code 0xbf000002 -- SError
> > [ 0.702248][ C5] CPU: 5 PID: 48 Comm: kworker/5:1 Not tainted 5.15.35-android13-5-00092-g525d77310a20 #1
> > [ 0.702261][ C5] Hardware name: Radxa ROCK Pi 4B (DT)
> > [ 0.702266][ C5] Workqueue: pm genpd_power_off_work_fn.cfi_jt
> > [ 0.702289][ C5] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 0.702301][ C5] pc : regmap_mmio_read32le+0x14/0x2c
> > [ 0.702318][ C5] lr : regmap_mmio_read+0x68/0xd0
> > [ 0.702331][ C5] sp : ffffffc00b6d3b40
> > [ 0.702335][ C5] x29: ffffffc00b6d3b40 x28: 0000000000000000 x27: 0000000000000000
> > [ 0.702351][ C5] x26: ffffff8000923680 x25: ffffffc009abc2a0 x24: ffffff8000930c00
> > [ 0.702364][ C5] x23: 0000000000000014 x22: ffffff8000930c00 x21: 0000000000000008
> > [ 0.702378][ C5] x20: ffffff8000922300 x19: ffffff8000923680 x18: ffffffc00b66d058
> > [ 0.702391][ C5] x17: 000000000000ba7e x16: ffffffc00a4dee04 x15: 000000000000b67e
> > [ 0.702405][ C5] x14: 00000000028dd7a0 x13: 0000000000000040 x12: 0000000000000000
> > [ 0.702419][ C5] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000005
> > [ 0.702432][ C5] x8 : 0000000000000000 x7 : 00756d6d6f692e30 x6 : 3030383035366666
> > [ 0.702445][ C5] x5 : 0000000000000001 x4 : 028dea248fba33d6 x3 : 0000000000000000
> > [ 0.702457][ C5] x2 : ffffff8000923680 x1 : 0000000000000008 x0 : 0000000000000000
> > [ 0.702472][ C5] Kernel panic - not syncing: Asynchronous SError Interrupt
> > [ 0.702477][ C5] CPU: 5 PID: 48 Comm: kworker/5:1 Not tainted 5.15.35-android13-5-00092-g525d77310a20 #1
> > [ 0.702487][ C5] Hardware name: Radxa ROCK Pi 4B (DT)
> > [ 0.702492][ C5] Workqueue: pm genpd_power_off_work_fn.cfi_jt
> > [ 0.702506][ C5] Call trace:
> > [ 0.702508][ C5] dump_backtrace.cfi_jt+0x0/0x8
> > [ 0.702525][ C5] dump_stack_lvl+0x80/0xb8
> > [ 0.702536][ C5] panic+0x180/0x444
> > [ 0.702547][ C5] arm64_serror_panic+0x1c0/0x210
> > [ 0.702561][ C5] do_serror+0x17c/0x218
> > [ 0.702572][ C5] el1h_64_error_handler+0x38/0x50
> > [ 0.702581][ C5] el1h_64_error+0x7c/0x80
> > [ 0.702589][ C5] regmap_mmio_read32le+0x14/0x2c
> > [ 0.702603][ C5] _regmap_bus_reg_read+0x3c/0x90
> > [ 0.702614][ C5] _regmap_read+0xb0/0x24c
> > [ 0.702623][ C5] rockchip_pd_power+0x6c4/0xbc0
> > [ 0.702638][ C5] rockchip_pd_power_off+0x18/0x28
> > [ 0.702652][ C5] _genpd_power_off+0x178/0x388
> > [ 0.702663][ C5] genpd_power_off+0x188/0x2e4
> > [ 0.702673][ C5] genpd_power_off_work_fn+0x54/0xe4
> > [ 0.702683][ C5] process_one_work+0x254/0x5a0
> > [ 0.702696][ C5] worker_thread+0x3ec/0x920
> > [ 0.702707][ C5] kthread+0x168/0x1dc
> > [ 0.702716][ C5] ret_from_fork+0x10/0x20
> > [ 0.702726][ C5] SMP: stopping secondary CPUs
> >
--
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs