[PATCH 2/2] ARM: optee-early: invalidate caches before jump to OP-TEE
Lucas Stach
l.stach at pengutronix.de
Tue Jun 3 02:57:21 PDT 2025
Hi Fabian,
On Tuesday, 03.06.2025 at 11:20 +0200, Fabian Pflug wrote:
> The optee-early code was initially added for i.MX6UL. Naively enabling
> it on i.MX6Q boards was observed to cause spurious hangs on return
> from OP-TEE to barebox.
>
> The root cause seems to be inadequate cache handling by OP-TEE: OP-TEE
> enables the MMU and caches with it, but didn't take care to invalidate
> all cache lines before enabling the MMU, which triggered the
> aforementioned hangs.
>
> To paper over this issue, let's just invalidate the cache lines on the
> barebox side instead before jumping to OP-TEE. This issue likely did not
> affect the original i.MX6UL, because its Cortex-A7 has an architected L2
> cache that's guaranteed zeroed (no dirty cache lines) on power-on reset,
> unlike the i.MX6Q's Cortex-A9, where the external L2 cache powers on
> with unpredictable content including the dirty bits.
>
The explanation here doesn't make too much sense to me. I don't think
the outer L2 cache is even enabled at this point, but even if it were,
arm_early_mmu_cache_invalidate() only handles architected caches, so it
wouldn't affect the PL310 on the i.MX6Q/DL.

The real issue with the Cortex-A9 caches is that the tags aren't
cleared on power-up, so some sets/ways may end up in "valid" state if
not explicitly invalidated. Thus any write to memory may get stuck in
the cache, even if caching is disabled, as this knob only turns off
allocation in the cache, but doesn't prevent updates of such bogus
valid lines. If you then proceed to invalidate the cache, you may
discard data that has not yet reached DRAM. So IMO this fix seems
risky, as it assumes that there have been no writes to memory that are
worth keeping before calling start_optee_early(). While this might be
the case in the current implementation, this assumption is quite
non-obvious to someone just looking at the individual functions.

The stuck writes are also why OP-TEE is unable to handle this itself:
any cache invalidation there would risk discarding writes from software
running before OP-TEE. So the only way to handle this properly is to
invalidate the caches before issuing any writes.
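
To make the ordering argument concrete, here is a toy model in plain C.
It is not barebox or hardware code: every name in it (toy_sys,
toy_store_cache_off, toy_run, ...) and the 4-line direct-mapped cache
are invented for illustration, and it only encodes the behaviour
described above as an assumption (a store with caching disabled still
updates a line whose tag happens to be valid).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NLINES 4	/* direct-mapped, one word per line (assumption) */
#define TOY_NWORDS 16	/* toy DRAM size in words (assumption) */

struct toy_line {
	bool valid;
	uint32_t tag;
	uint32_t data;
};

struct toy_sys {
	struct toy_line cache[TOY_NLINES];
	uint32_t dram[TOY_NWORDS];
};

/* Power-on: the tag RAM is not cleared, so one line looks valid for bogus_addr. */
static void toy_power_on(struct toy_sys *s, uint32_t bogus_addr)
{
	memset(s, 0, sizeof(*s));
	s->cache[bogus_addr % TOY_NLINES].valid = true;
	s->cache[bogus_addr % TOY_NLINES].tag = bogus_addr / TOY_NLINES;
}

/*
 * Store with caching disabled: no new line is allocated, but a hit on an
 * already-valid line updates that line and never reaches DRAM.
 */
static void toy_store_cache_off(struct toy_sys *s, uint32_t addr, uint32_t val)
{
	struct toy_line *l;

	assert(addr < TOY_NWORDS);
	l = &s->cache[addr % TOY_NLINES];

	if (l->valid && l->tag == addr / TOY_NLINES)
		l->data = val;		/* write is stuck in the cache */
	else
		s->dram[addr] = val;	/* write goes straight to DRAM */
}

/* Invalidate without clean: every line is dropped, contents discarded. */
static void toy_invalidate_all(struct toy_sys *s)
{
	for (int i = 0; i < TOY_NLINES; i++)
		s->cache[i].valid = false;
}

/*
 * Returns what DRAM holds at address 5 after one store and a late
 * invalidation, optionally preceded by an invalidation before any write.
 */
uint32_t toy_run(bool invalidate_first)
{
	struct toy_sys s;

	toy_power_on(&s, 5);
	if (invalidate_first)
		toy_invalidate_all(&s);
	toy_store_cache_off(&s, 5, 0xdeadbeef);
	toy_invalidate_all(&s);	/* e.g. right before jumping to OP-TEE */

	return s.dram[5];
}
```

In this model toy_run(false) returns 0, because the store hit the bogus
valid line and was thrown away by the late invalidation, while
toy_run(true) returns 0xdeadbeef, because nothing was valid when the
store was issued.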

I guess it would be much better to simply make the
arm_early_mmu_cache_invalidate() call part of the Cortex-A9 lowlevel
CPU initialization at the very start of the PBL entry.

Regards,
Lucas
> This means on e.g. the i.MX6UL, we will now do one extra cache invalidation
> that's not needed. This should be negligible, and we already have an
> unconditional invalidation in __barebox_arm_entry.
>
> Note that this is a different implementation than what we do on ARM64:
> there we load TF-A, which then jumps to OP-TEE. Assuming
> non-architected caches or caches with uninitialized content on power-on
> to be a dying breed, our ARM64 implementation is likely not affected.
>
> Co-authored-by: Ahmad Fatoum <a.fatoum at pengutronix.de>
> Signed-off-by: Ahmad Fatoum <a.fatoum at pengutronix.de>
> Signed-off-by: Fabian Pflug <f.pflug at pengutronix.de>
> ---
> arch/arm/lib32/optee-early.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/arch/arm/lib32/optee-early.c b/arch/arm/lib32/optee-early.c
> index 0cda0ab163..b1dba67d42 100644
> --- a/arch/arm/lib32/optee-early.c
> +++ b/arch/arm/lib32/optee-early.c
> @@ -35,6 +35,19 @@ int start_optee_early(void *fdt, void *tee)
>  	/* We use setjmp/longjmp here because OP-TEE clobbers most registers */
>  	ret = setjmp(tee_buf);
>  	if (ret == 0) {
> +		/*
> +		 * At least OP-TEE v4.1.0 seems to not invalidate all dirty cache
> +		 * lines before enabling the MMU. This can lead to spurious hangs
> +		 * on return to barebox on systems where there might be left-over
> +		 * dirty cache lines, whether from BootROM or because L2 cache
> +		 * is non-architected and powers on with unpredictable content
> +		 * like is the case with PL310 on i.MX6Q.
> +		 *
> +		 * Let's invalidate the caches here, so board entry points need
> +		 * not bother.
> +		 */
> +		arm_early_mmu_cache_invalidate();
> +
>  		tee_start(0, 0, fdt);
>  		longjmp(tee_buf, 1);
>  	}