[U-Boot] [PATCH 2/2] rockchip: rk3399: rockpro64: enable force power on reset workaround
Peter Geis
pgwipeout at gmail.com
Thu May 19 04:16:44 PDT 2022
On Thu, May 19, 2022 at 4:17 AM Lee Jones <lee.jones at linaro.org> wrote:
>
> On Wed, 18 May 2022, Peter Geis wrote:
> > On Wed, May 18, 2022 at 7:56 AM Lee Jones <lee.jones at linaro.org> wrote:
> > >
> > > Looping in a few relevant/active kernel people/lists for full coverage.
> > >
> > > On Sun, 01 Dec 2019, Hugh Cole-Baker wrote:
> > > > > On 29 Nov 2019, at 01:06, Vasily Khoruzhick <anarsoul at gmail.com> wrote:
> > > > > On Thu, Nov 28, 2019 at 4:59 PM Kever Yang <kever.yang at rock-chips.com> wrote:
> > > > >>
> > > > >> Hi Vasily,
> > > > >>
> > > > > >> On 2019/11/28 11:51 PM, Vasily Khoruzhick wrote:
> > > > >>> On Thu, Nov 28, 2019 at 1:23 AM Kever Yang <kever.yang at rock-chips.com> wrote:
> > > > >>>> Hi Vasily,
> > > > >>>>
> > > > >>>> I think this should not be needed, see comments below.
> > > > >>> Hi Kever,
> > > > >>>
> > > > >>> I've spent 2 weeks of my evenings debugging this issue but
> > > > >>
> > > > > >> I can understand you worked pretty hard to make it work; it's not so easy
> > > > > >> to identify the root cause sometimes. Thanks very much for working on this.
> > > > >>
> > > > >>> unfortunately I don't have a proper fix. This is the only solution
> > > > >>> that makes my rockpro64 reboot reliably with mainline u-boot and ATF.
> > > > >>> See my comments below.
> > > >
> > > > I also had a problem where Linux would hang or panic after rebooting, with
> > > > mainline u-boot and ATF on a rockpro64. This patch does fix the issue for me,
> > > > I have tested it by performing 10 reboots from Linux in a row and I've seen
> > > > no hangs or panics.
> > > >
> > > > I noticed the Armbian project have recently included a patch to ATF [1] which
> > > > switches all power domains on before ATF performs a soft reset. I have also
> > > > tested using u-boot mainline, without any patches to u-boot, but including ATF
> > > > patched with your reset fix [2] and the Armbian power domains patch [1]. This
> > > > also fixes the same hanging on reboot issue for me without modifications to
> > > > u-boot, I've also tested 10 reboots in a row with this ATF and seen no hangs.
> > > >
> > > > So this u-boot patch may not be needed if ATF is patched instead to switch
> > > > power domains on before soft reset.
> > > >
> > > > FWIW, when I was able to see panic messages from Linux when it panicked on
> > > > boot, the call trace always seemed to include rockchip_pd_power_off() [3].
> > > >
> > > > [1] https://github.com/armbian/build/blob/master/patch/atf/atf-rk3399/switch-power-domains-on-before-reset.patch
> > > > [2] https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/2512
> > > > [3] https://gist.github.com/sigmaris/c0e155c8cb0a325d84f549185f9a568c
> > >
> > > This last paste looks remarkably similar to an issue currently seen on
> > > the Radxa ROCK Pi 4B (RK3399) during power-up after a soft reboot
> > > (`sudo reboot`) is issued. We're presently running v5.15.35 [0].
> >
> > Good Evening,
>
> Hi, Peter,
>
> Thank you so much for your reply.
>
> > That's definitely not stock v5.15.35, it's been tagged as an android kernel.
> > 5.15.35-android13-5-00092-g525d77310a20
>
> It's not stock, no.
>
> Although the differences from RockPi's perspective are minimal.
>
> The main difference is the way the kernel is configured.
>
> It's GKI:
>
> https://android.googlesource.com/kernel/common/+/refs/heads/android13-5.15/arch/arm64/configs/gki_defconfig
>
> Plus a few non-GKI specifics:
>
> https://android.googlesource.com/kernel/common/+/refs/heads/android13-5.15/arch/arm64/configs/rockpi4_gki.fragment
Ah, so close enough to not matter much.
>
> > > It's not clear how this issue (present 3 years ago) was finally
> > > resolved. From the thread, it looks as if the fix might have made its
> > > way into ATF, but I'm 87.6% sure ATF is not running on this platform
> > > (yet).
> >
> > The rk3399 SoC has a hardware bug where the power domains are not
> > reset upon a soft reset. This leads to situations like this one where
> > power domains are shut down during shutdown but aren't restored on
> > reboot.
>
> I assume this isn't something we can patch in the kernel driver?
As far as I know it's being worked on by others. I have some ideas for
this as well, but I've been focused on rk356x lately.
>
> > Mainline TF-A was patched to force all power domains online
> > when a soft reboot is triggered, which solved that issue.
>
> Okay, this is what I figured.
>
> > What particular issues are you having initializing modern u-boot on
> > this device?
>
> This is the output: https://pastebin.ubuntu.com/p/d5DmsSBnrR/
>
> Speaking with one of the guys who supports RockPi 4 in AOSP, he
> suspects the DDR settings. Apparently settings for older SoCs
> sometimes get clobbered when support for newer SoCs is added.
The rk3399 TPL code is specific to the rk3399 and it really hasn't
been touched much recently. I'm using the latest Mainline U-Boot on
both my Rockpro64 and Pinephone-Pro. I don't see TF-A being loaded,
which should happen between:
Trying to boot from BOOTROM
Returning to boot ROM...
Otherwise it just looks like the TPL code doesn't like being in a
single-channel configuration. Does the 2GB model just forgo the second
RAM chip? Or is this actually a 4GB model and it isn't detecting the
second chip in both downstream and mainline? Could you include the
TPL/SPL portion of downstream's output?
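
A quick way to confirm whether TF-A ever ran is to scan the captured
serial log for a BL31 banner. The sketch below assumes TF-A announces
itself on the console with lines containing "BL31" (mainline builds
normally print "NOTICE:  BL31: ..."). The function name and the sample
log excerpts are made up for illustration.

```python
def saw_bl31(serial_log: str) -> bool:
    """Return True if any line of the serial log mentions BL31.

    Assumption: TF-A prints a banner containing "BL31" on the console.
    """
    return any("BL31" in line for line in serial_log.splitlines())


# Hypothetical log excerpts, not captured from a real board.
log_without_tfa = "Trying to boot from BOOTROM\nReturning to boot ROM...\n"
log_with_tfa = log_without_tfa + "NOTICE:  BL31: v2.6\n"

print(saw_bl31(log_without_tfa))
print(saw_bl31(log_with_tfa))
```

If this reports no banner for a full boot capture, TF-A most likely
was never handed control, or its console output is disabled.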
>
> I am yet to investigate the u-boot story in any detail.
>
> It's on my TODO list for today.
>
> > Is there a particular reason it isn't using Mainline TF-A?
>
> We're not using Trusted Firmware yet.
This platform does not work at all without TF-A. OP-TEE is optional.
Either you are using the downstream blob from Rockchip or Mainline
built yourself. Personally I prefer using Mainline everything. If you
build Mainline U-Boot without TF-A it will throw a warning at the end
that says the created binary is non-functional.
>
> Although I'm starting to think this should be re-prioritised.
>
> > I've also run into issues on rk356x where the regulator powering a
> > power domain isn't powered due to a soft reset, which also causes
> > faults like this. Set your main regulators to always-on and see if it
> > helps with the issue.
>
> I'll do that. Thanks for the tip.
>
> Our main issue currently is an RCU-lock-up, again on soft reboot:
>
> [ 21.226951][ C0] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 21.227637][ C0] rcu: 5-...!: (1 GPs behind) idle=3de/1/0x4000000000000000 softirq=9/10 fqs=3 last_accelerate: 0000/efb9 dyntick_enabled: 0
> [ 21.228890][ C0] (detected by 0, t=5252 jiffies, g=-1167, q=46)
>
> Do you think these issues could all be related?
If you've powered down a minor power domain while a driver was still
active, sure.
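
One rough way to sort panics into "power domain related" or not is to
pull the symbols out of the call trace and look for
rockchip_pd_power_off(), which shows up in the SError trace quoted
further down. This is only a sketch: the helper names are hypothetical,
and it assumes trace entries have the usual "symbol+0xoff/0xsize" form.

```python
import re

# Matches the symbol in entries such as "rockchip_pd_power_off+0x18/0x28".
SYMBOL_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def trace_symbols(log: str) -> list[str]:
    """Return the symbols listed after the last 'Call trace:' line."""
    _, found, tail = log.rpartition("Call trace:")
    if not found:
        return []
    return SYMBOL_RE.findall(tail)


def hits_power_domain(log: str) -> bool:
    """True if the trace passes through rockchip_pd_power_off()."""
    return "rockchip_pd_power_off" in trace_symbols(log)


# A few lines copied from the trace quoted in this thread.
sample = """\
[ 0.702506][ C5] Call trace:
[ 0.702589][ C5] regmap_mmio_read32le+0x14/0x2c
[ 0.702623][ C5] rockchip_pd_power+0x6c4/0xbc0
[ 0.702638][ C5] rockchip_pd_power_off+0x18/0x28
[ 0.702663][ C5] genpd_power_off+0x188/0x2e4
"""
print(hits_power_domain(sample))
```

An RCU stall report has no such trace by itself, so this only helps
when the log also contains a panic or backtrace.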
>
> Thanks ever so much for your reply Peter.
>
> You've potentially saved us hours and hours of debugging.
Not a problem!
>
> Kind regards,
> Lee
>
> > > Note that the u-boot we're using is also quite old:
> > >
> > > U-Boot 2019.10-09248-g8511c75bb4 (Jan 08 2020 - 17:13:03 -0800)
> > >
> > > ... so this could easily be the root cause. The current plan is to
> > > try to update this ASAP. However early attempts are yet to result in
> > > a successful boot.
> > >
> > > I see that Brian recently added a few patches related to PD/DVFS, but
> > > again, these appear to be ATF related.
> > >
> > > Would anyone be able to shed some light onto this for me please?
> > >
> > > As always, any help would be gratefully received.
> > >
> > > Kind regards,
> > > Lee
> > >
> > > [0]
> > > Full reboot log can be seen at: https://pastebin.ubuntu.com/p/MjZP2V6kQ3/
> > >
> > > [ 0.699736][ T1] initcall __initstub__kmod_iommu__362_155_iommu_subsys_init4+0x0/0x8 returned 0 after 0 usecs
> > > [ 0.700737][ T1] calling __initstub__kmod_rockchip_iommu__348_1415_rk_iommu_init4+0x0/0x8 @ 1
> > > [ 0.702238][ C5] SError Interrupt on CPU5, code 0xbf000002 -- SError
> > > [ 0.702248][ C5] CPU: 5 PID: 48 Comm: kworker/5:1 Not tainted 5.15.35-android13-5-00092-g525d77310a20 #1
> > > [ 0.702261][ C5] Hardware name: Radxa ROCK Pi 4B (DT)
> > > [ 0.702266][ C5] Workqueue: pm genpd_power_off_work_fn.cfi_jt
> > > [ 0.702289][ C5] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > [ 0.702301][ C5] pc : regmap_mmio_read32le+0x14/0x2c
> > > [ 0.702318][ C5] lr : regmap_mmio_read+0x68/0xd0
> > > [ 0.702331][ C5] sp : ffffffc00b6d3b40
> > > [ 0.702335][ C5] x29: ffffffc00b6d3b40 x28: 0000000000000000 x27: 0000000000000000
> > > [ 0.702351][ C5] x26: ffffff8000923680 x25: ffffffc009abc2a0 x24: ffffff8000930c00
> > > [ 0.702364][ C5] x23: 0000000000000014 x22: ffffff8000930c00 x21: 0000000000000008
> > > [ 0.702378][ C5] x20: ffffff8000922300 x19: ffffff8000923680 x18: ffffffc00b66d058
> > > [ 0.702391][ C5] x17: 000000000000ba7e x16: ffffffc00a4dee04 x15: 000000000000b67e
> > > [ 0.702405][ C5] x14: 00000000028dd7a0 x13: 0000000000000040 x12: 0000000000000000
> > > [ 0.702419][ C5] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000005
> > > [ 0.702432][ C5] x8 : 0000000000000000 x7 : 00756d6d6f692e30 x6 : 3030383035366666
> > > [ 0.702445][ C5] x5 : 0000000000000001 x4 : 028dea248fba33d6 x3 : 0000000000000000
> > > [ 0.702457][ C5] x2 : ffffff8000923680 x1 : 0000000000000008 x0 : 0000000000000000
> > > [ 0.702472][ C5] Kernel panic - not syncing: Asynchronous SError Interrupt
> > > [ 0.702477][ C5] CPU: 5 PID: 48 Comm: kworker/5:1 Not tainted 5.15.35-android13-5-00092-g525d77310a20 #1
> > > [ 0.702487][ C5] Hardware name: Radxa ROCK Pi 4B (DT)
> > > [ 0.702492][ C5] Workqueue: pm genpd_power_off_work_fn.cfi_jt
> > > [ 0.702506][ C5] Call trace:
> > > [ 0.702508][ C5] dump_backtrace.cfi_jt+0x0/0x8
> > > [ 0.702525][ C5] dump_stack_lvl+0x80/0xb8
> > > [ 0.702536][ C5] panic+0x180/0x444
> > > [ 0.702547][ C5] arm64_serror_panic+0x1c0/0x210
> > > [ 0.702561][ C5] do_serror+0x17c/0x218
> > > [ 0.702572][ C5] el1h_64_error_handler+0x38/0x50
> > > [ 0.702581][ C5] el1h_64_error+0x7c/0x80
> > > [ 0.702589][ C5] regmap_mmio_read32le+0x14/0x2c
> > > [ 0.702603][ C5] _regmap_bus_reg_read+0x3c/0x90
> > > [ 0.702614][ C5] _regmap_read+0xb0/0x24c
> > > [ 0.702623][ C5] rockchip_pd_power+0x6c4/0xbc0
> > > [ 0.702638][ C5] rockchip_pd_power_off+0x18/0x28
> > > [ 0.702652][ C5] _genpd_power_off+0x178/0x388
> > > [ 0.702663][ C5] genpd_power_off+0x188/0x2e4
> > > [ 0.702673][ C5] genpd_power_off_work_fn+0x54/0xe4
> > > [ 0.702683][ C5] process_one_work+0x254/0x5a0
> > > [ 0.702696][ C5] worker_thread+0x3ec/0x920
> > > [ 0.702707][ C5] kthread+0x168/0x1dc
> > > [ 0.702716][ C5] ret_from_fork+0x10/0x20
> > > [ 0.702726][ C5] SMP: stopping secondary CPUs
> > >
> > >
> > > _______________________________________________
> > > Linux-rockchip mailing list
> > > Linux-rockchip at lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-rockchip
>
> --
> Lee Jones [李琼斯]
> Principal Technical Lead - Developer Services
> Linaro.org │ Open source software for Arm SoCs
> Follow Linaro: Facebook | Twitter | Blog