[PATCH v2] arm64: errata: Add NXP iMX8QM workaround for A53 Cache coherency issue
Ivan T. Ivanov
iivanov at suse.de
Thu Jun 8 06:39:29 PDT 2023
On 06-02 11:34, Will Deacon wrote:
>
> Hi Ivan,
>
> Sorry for the delay in getting to this.
>
> On Thu, Apr 20, 2023 at 02:29:52PM +0300, Ivan T. Ivanov wrote:
> > According to the NXP errata document[1], the i.MX8QuadMax SoC suffers from
> > a serious cache coherence issue. It was also mentioned in the initial
> > support[2] for the imx8qm mek machine.
> >
> > I chose to use the ALTERNATIVE() framework, instead of the downstream
> > solution[3], for this issue in the hope of reducing the effect of this fix
> > on unaffected platforms.
> >
> > Unfortunately I was unable to find a way to identify the SoC using ID
> > registers. The boot CPU MIDR_EL1 is equal to 0x410fd034 and AIDR_EL1 is
> > equal to 0, so I fell back to using the devicetree compatible string for this.
> > And because we cannot guarantee that VMs on top of KVM will get the
> > correct devicetree, KVM is disabled.
> >
> > Also make sure that SMMU "Broadcast TLB Maintenance" is not used, even
> > though the SMMU device built into this SoC supports it.
> >
> > I know this fix is a suboptimal solution for the affected machines, but I
> > haven't been able to come up with a less intrusive fix. And I hope that once
> > the TLB caches are invalidated, any immediate attempt to invalidate them again
> > will be close to a NOP operation (flush_tlb_kernel_range()).
>
> [...]
>
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 1023e896d46b..a5fe6ffb8150 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1159,6 +1159,18 @@ config SOCIONEXT_SYNQUACER_PREITS
> >
> > If unsure, say Y.
> >
> > +config NXP_IMX8QM_ERRATUM_ERR050104
> > + bool "NXP iMX8QM ERR050104: broken cache/tlb invalidation"
> > + default y
> > + help
> > + On iMX8QM, addresses above bit 35 are not broadcast correctly for
> > + TLBI or IC operations, making TLBI and IC unreliable.
> > +
> > + Work around this erratum by using TLBI *ALL*IS and IC IALLUIS
> > + operations. EL0 use of IC IVAU is trapped and upgraded to IC IALLUIS.
> > +
> > + If unsure, say Y.
> > +
> > endmenu # "ARM errata workarounds via the alternatives framework"
> >
> > choice
> > diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> > index 412a3b9a3c25..e3bad2298ea5 100644
> > --- a/arch/arm64/include/asm/tlbflush.h
> > +++ b/arch/arm64/include/asm/tlbflush.h
> > @@ -37,7 +37,11 @@
> > : : )
> >
> > #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \
> > - "tlbi " #op ", %0\n" \
> > + ALTERNATIVE("tlbi " #op ", %0\n", \
> > + "tlbi vmalle1is\n", \
> > + ARM64_WORKAROUND_NXP_ERR050104) \
> > + : : "r" (arg)); \
> > + asm (ARM64_ASM_PREAMBLE \
> > ALTERNATIVE("nop\n nop", \
> > "dsb ish\n tlbi " #op ", %0", \
> > ARM64_WORKAROUND_REPEAT_TLBI, \
>
> I can see two problems with this:
>
> 1. It doesn't interact properly with the ARM64_WORKAROUND_REPEAT_TLBI
> workaround
From what I see, they cannot happen at the same time on the same SoC,
so this should not be an issue, but I could be missing something.
>
> 2. You're nuking the entire TLB in the low-level helper, so things like
> flush_tlb_range() are going to over-invalidate excessively
Right, this is noted in the commit message.
>
> Granted, you may not care about (1) for your SoC, but I would prefer not
> to introduce artificial constraints on the workaround. I think both of
> the issues above stem from plumbing this in too low in the stack.
>
> Can you instead use a static key to redirect the higher-level control flow
> to flush_tlb_mm() or flush_tlb_all() for user and kernel respectively?
This was the other option considered, but it looked more intrusive.
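For illustration, something along these lines is what I understood by that
suggestion (sketch only, untested; the key name and hook points are made up):

#include <linux/jump_label.h>
#include <linux/mm_types.h>

/* Sketch only: the key would be enabled from the erratum's cpu_enable hook. */
DEFINE_STATIC_KEY_FALSE(imx8qm_err050104);

static inline void flush_tlb_page(struct vm_area_struct *vma,
				  unsigned long uaddr)
{
	if (static_branch_unlikely(&imx8qm_err050104)) {
		/* By-VA TLBI is not broadcast reliably: invalidate the whole ASID. */
		flush_tlb_mm(vma->vm_mm);
		return;
	}
	/* ... existing by-VA invalidation ... */
}

static inline void flush_tlb_kernel_range(unsigned long start,
					  unsigned long end)
{
	if (static_branch_unlikely(&imx8qm_err050104)) {
		/* Kernel mappings have no ASID to fall back on: invalidate everything. */
		flush_tlb_all();
		return;
	}
	/* ... existing by-VA invalidation ... */
}

It ends up touching every caller-visible helper in asm/tlbflush.h instead of
the single __TLBI_1 macro, which is what looked more intrusive to me.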
> I'm
> assuming the ASID _is_ carried on the interconnect?
Looking at the history of tlbflush in the vendor tree [4] I think the ASID
is not properly wired either [5]. But I am failing to see why this
should matter.
[4] https://github.com/nxp-imx/linux-imx/commits/lf-6.1.y/arch/arm64/include/asm/tlbflush.h
[5] https://github.com/nxp-imx/linux-imx/commit/db6c157fff2a39c378e41e63f78d6fba18578758
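If the question is whether flush_tlb_mm() itself can be trusted: as far as I
understand it boils down to a by-ASID broadcast invalidation, roughly
(paraphrasing the current asm/tlbflush.h):

static inline void flush_tlb_mm(struct mm_struct *mm)
{
	unsigned long asid;

	dsb(ishst);
	asid = __TLBI_VADDR(0, ASID(mm));
	__tlbi(aside1is, asid);		/* invalidate by ASID, inner shareable */
	__tlbi_user(aside1is, asid);
	dsb(ish);
}

so if the ASID is not carried correctly on the interconnect either, that path
would presumably have to be upgraded to TLBI VMALLE1IS as well.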
>
> > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> > index 307faa2b4395..b0647b64dbb8 100644
> > --- a/arch/arm64/kernel/cpu_errata.c
> > +++ b/arch/arm64/kernel/cpu_errata.c
> > @@ -8,6 +8,7 @@
> > #include <linux/arm-smccc.h>
> > #include <linux/types.h>
> > #include <linux/cpu.h>
> > +#include <linux/of.h>
> > #include <asm/cpu.h>
> > #include <asm/cputype.h>
> > #include <asm/cpufeature.h>
> > @@ -55,6 +56,14 @@ is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
> > return model == entry->midr_range.model;
> > }
> >
> > +static bool __maybe_unused
> > +is_imx8qm_soc(const struct arm64_cpu_capabilities *entry, int scope)
> > +{
> > + WARN_ON(preemptible());
> > +
> > + return of_machine_is_compatible("fsl,imx8qm");
> > +}
> > +
> > static bool
> > has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
> > int scope)
> > @@ -729,6 +738,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> > MIDR_FIXED(MIDR_CPU_VAR_REV(1,1), BIT(25)),
> > .cpu_enable = cpu_clear_bf16_from_user_emulation,
> > },
> > +#endif
> > +#ifdef CONFIG_NXP_IMX8QM_ERRATUM_ERR050104
> > + {
> > + .desc = "NXP erratum ERR050104",
> > + .capability = ARM64_WORKAROUND_NXP_ERR050104,
> > + .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
> > + .matches = is_imx8qm_soc,
> > + .cpu_enable = cpu_enable_cache_maint_trap,
> > + },
> > #endif
> > {
> > }
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index 4a79ba100799..265b6334291b 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -556,6 +556,11 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
> > __user_cache_maint("dc civac", address, ret);
> > break;
> > case ESR_ELx_SYS64_ISS_CRM_IC_IVAU: /* IC IVAU */
> > + if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
> > + asm volatile("ic ialluis");
>
> Hmm, one oddity here is that you can pass a faulting address and not see
> the fault. It looks like that's already IMP DEF, so it's probably ok, but
> might be worth a comment.
I am not sure what the expected behavior should be, but I could
add a comment, sure.
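Something like this, perhaps (just the hunk above with the comment added and
the redundant ret assignment dropped):

	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
			/*
			 * ERR050104 means by-VA IC operations are not
			 * broadcast correctly, so upgrade to IC IALLUIS.
			 * A side effect is that a bogus address passed by
			 * EL0 will no longer produce the (IMP DEF) fault it
			 * might otherwise have seen.
			 */
			asm volatile("ic ialluis");
			break;
		}
		__user_cache_maint("ic ivau", address, ret);
		break;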
>
> > + ret = 0;
>
> 'ret' is initialised to 0 already?
Oh, yes.
>
> Finally, how come you don't need to upgrade I-cache invalidation by-VA
> in the kernel? It looks like you're only handling operations trapped
> from EL0.
Hm, I was thinking that __tlbi() takes care of this, or do you mean
something else, like the locations in assembler.h?
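If kernel-side by-VA I-cache invalidation needs the same treatment, I guess
it would end up looking something like this (rough sketch only;
kernel_icache_inval_by_va() is a placeholder for whatever helper currently
does the "ic ivau" loop behind flush_icache_range()):

#include <asm/barrier.h>
#include <asm/cpufeature.h>

static inline void kernel_icache_inval(unsigned long start, unsigned long end)
{
	if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
		/* By-VA IC ops are unreliable: invalidate the whole I-cache. */
		asm volatile("ic ialluis" : : : "memory");
		dsb(ish);
		isb();
		return;
	}
	kernel_icache_inval_by_va(start, end);	/* placeholder for the existing by-VA path */
}

Whether the right place for that is assembler.h, cacheflush.h or cache.S I
would need to check.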
Regards,
Ivan