[PATCH] arm64: errata: Add NXP iMX8QM workaround for A53 Cache coherency issue

Frank Li frank.li at nxp.com
Mon Apr 17 07:48:50 PDT 2023



> -----Original Message-----
> From: Peng Fan (OSS) <peng.fan at oss.nxp.com>
> Sent: Sunday, April 16, 2023 10:08 PM
> To: Ivan T. Ivanov <iivanov at suse.de>; Catalin Marinas
> <catalin.marinas at arm.com>; Will Deacon <will at kernel.org>; Frank Li
> <frank.li at nxp.com>
> Cc: Mark Brown <broonie at kernel.org>; Mark Rutland
> <mark.rutland at arm.com>; Shawn Guo <shawnguo at kernel.org>; Aisheng
> Dong <aisheng.dong at nxp.com>; linux-arm-kernel at lists.infradead.org; dl-
> linux-imx <linux-imx at nxp.com>
> Subject: Re: [PATCH] arm64: errata: Add NXP iMX8QM workaround for A53
> Cache coherency issue
> 
> +Frank, who had worked on downstream solution.

+ Jason Liu, who actually work on this. 

Best regards
Frank Li

> 
> On 4/12/2023 8:55 PM, Ivan T. Ivanov wrote:
> > According to NXP errata document[1] i.MX8QuadMax SoC suffers from
> > serious cache coherence issue. It was also mentioned in initial
> > support[2] for imx8qm mek machine.
> >
> > I chose to use an ALTERNATIVE() framework, instead downstream
> solution[3],
> > for this issue with the hope to reduce effect of this fix on unaffected
> > platforms.
> >
> > Unfortunately I was unable to find a way to identify SoC ID using
> > registers. Boot CPU MIDR_EL1 is equal to 0x410fd034. So I fallback to
> > using devicetree compatible strings for this.
> >
> > I know this fix is a suboptimal solution for affected machines, but I
> > haven't been able to come up with a less intrusive fix.  And I hope once
> > TLB caches are invalidated any immediate attempt to invalidate them again
> > will be close to NOP operation (flush_tlb_kernel_range())
> >
> > I have run few simple benchmarks and perf tests on affected and
> unaffected
> > machines and I was not able see any obvious issues. iMX8QM
> "performance"
> > was nearly doubled with 2 A72 bringed online.
> >
> > Following is excerpt from NXP IMX8_1N94W "Mask Set Errata" document
> > Rev. 5, 3/2023. Just in case it gets lost somehow.
> >
> > ---
> > "ERR050104: Arm/A53: Cache coherency issue"
> >
> > Description
> >
> > Some maintenance operations exchanged between the A53 and A72
> > core clusters, involving some Translation Look-aside Buffer
> > Invalidate (TLBI) and Instruction Cache (IC) instructions can
> > be corrupted. The upper bits, above bit-35, of ARADDR and ACADDR
> > buses within in Arm A53 sub-system have been incorrectly connected.
> > Therefore ARADDR and ACADDR address bits above bit-35 should not
> > be used.
> >
> > Workaround
> >
> > The following software instructions are required to be downgraded
> > to TLBI VMALLE1IS:  TLBI ASIDE1, TLBI ASIDE1IS, TLBI VAAE1,
> > TLBI VAAE1IS, TLBI VAALE1, TLBI VAALE1IS, TLBI VAE1, TLBI VAE1IS,
> > TLBI VALE1, TLBI VALE1IS
> >
> > The following software instructions are required to be downgraded
> > to TLBI VMALLS12E1IS: TLBI IPAS2E1IS, TLBI IPAS2LE1IS
> >
> > The following software instructions are required to be downgraded
> > to TLBI ALLE2IS: TLBI VAE2IS, TLBI VALE2IS.
> >
> > The following software instructions are required to be downgraded
> > to TLBI ALLE3IS: TLBI VAE3IS, TLBI VALE3IS.
> >
> > The following software instructions are required to be downgraded
> > to TLBI VMALLE1IS when the Force Broadcast (FB) bit [9] of the
> > Hypervisor Configuration Register (HCR_EL2) is set:
> > TLBI ASIDE1, TLBI VAAE1, TLBI VAALE1, TLBI VAE1, TLBI VALE1
> >
> > The following software instruction is required to be downgraded
> > to IC IALLUIS: IC IVAU, Xt
> >
> > Specifically for the IC IVAU, Xt downgrade, setting SCTLR_EL1.UCI
> > to 0 will disable EL0 access to this instruction. Any attempt to
> > execute from EL0 will generate an EL1 trap, where the downgrade to
> > IC ALLUIS can be implemented.
> > --
> >
> > [1] https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf
> > [2] 307fd14d4b14 ("arm64: dts: imx: add imx8qm mek support")
> > [3] https://github.com/nxp-imx/linux-imx/blob/lf-
> 6.1.y/arch/arm64/include/asm/tlbflush.h#L19
> >
> > Signed-off-by: Ivan T. Ivanov <iivanov at suse.de>
> > ---
> >   Documentation/arm64/silicon-errata.rst |  2 ++
> >   arch/arm64/Kconfig                     | 10 ++++++++++
> >   arch/arm64/include/asm/cpufeature.h    |  3 ++-
> >   arch/arm64/include/asm/tlbflush.h      |  6 +++++-
> >   arch/arm64/kernel/cpu_errata.c         | 18 ++++++++++++++++++
> >   arch/arm64/kernel/traps.c              | 22 +++++++++++++++++++++-
> >   arch/arm64/tools/cpucaps               |  1 +
> >   7 files changed, 59 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/arm64/silicon-errata.rst
> b/Documentation/arm64/silicon-errata.rst
> > index ec5f889d7681..fce231797184 100644
> > --- a/Documentation/arm64/silicon-errata.rst
> > +++ b/Documentation/arm64/silicon-errata.rst
> > @@ -175,6 +175,8 @@ stable kernels.
> >   +----------------+-----------------+-----------------+-----------------------------+
> >   | Freescale/NXP  | LS2080A/LS1043A | A-008585        |
> FSL_ERRATUM_A008585         |
> >   +----------------+-----------------+-----------------+-----------------------------+
> > +| Freescale/NXP  | i.MX 8QuadMax   | ERR050104       |
> NXP_IMX8QM_ERRATUM_ERR050104|
> > ++----------------+-----------------+-----------------+-----------------------------+
> >   +----------------+-----------------+-----------------+-----------------------------+
> >   | Hisilicon      | Hip0{5,6,7}     | #161010101      |
> HISILICON_ERRATUM_161010101 |
> >   +----------------+-----------------+-----------------+-----------------------------+
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 1023e896d46b..437cb53f8753 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1159,6 +1159,16 @@ config SOCIONEXT_SYNQUACER_PREITS
> >
> >   	  If unsure, say Y.
> >
> > +config NXP_IMX8QM_ERRATUM_ERR050104
> > +	bool "NXP iMX8QM: Workaround for Arm/A53 Cache coherency
> issue"
> > +	default n
> > +	help
> > +	  Some maintenance operations exchanged between the A53 and
> A72 core
> > +	  clusters, involving some Translation Look-aside Buffer Invalidate
> > +	  (TLBI) and Instruction Cache (IC) instructions can be corrupted.
> > +
> > +	  If unsure, say N.
> > +
> >   endmenu # "ARM errata workarounds via the alternatives framework"
> >
> >   choice
> > diff --git a/arch/arm64/include/asm/cpufeature.h
> b/arch/arm64/include/asm/cpufeature.h
> > index 6bf013fb110d..1ed648f7f29a 100644
> > --- a/arch/arm64/include/asm/cpufeature.h
> > +++ b/arch/arm64/include/asm/cpufeature.h
> > @@ -835,7 +835,8 @@ static inline bool system_supports_bti(void)
> >   static inline bool system_supports_tlb_range(void)
> >   {
> >   	return IS_ENABLED(CONFIG_ARM64_TLB_RANGE) &&
> > -		cpus_have_const_cap(ARM64_HAS_TLB_RANGE);
> > +		cpus_have_const_cap(ARM64_HAS_TLB_RANGE) &&
> > +
> 	!cpus_have_const_cap(ARM64_WORKAROUND_NXP_ERR050104);
> >   }
> >
> >   int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
> > diff --git a/arch/arm64/include/asm/tlbflush.h
> b/arch/arm64/include/asm/tlbflush.h
> > index 412a3b9a3c25..12055b859ce3 100644
> > --- a/arch/arm64/include/asm/tlbflush.h
> > +++ b/arch/arm64/include/asm/tlbflush.h
> > @@ -37,7 +37,11 @@
> >   			    : : )
> >
> >   #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE
> 	       \
> > -			       "tlbi " #op ", %0\n"			       \
> > +		   ALTERNATIVE("nop\n	nop\n  tlbi " #op ", %0",              \
> > +			       "tlbi vmalle1is\n  dsb ish\n  isb",	       \
> > +			       ARM64_WORKAROUND_NXP_ERR050104)
> 	       \
> > +			    : : "r" (arg));				       \
> > +			  asm (ARM64_ASM_PREAMBLE
> 	       \
> >   		   ALTERNATIVE("nop\n			nop",		       \
> >   			       "dsb ish\n		tlbi " #op ", %0",     \
> >   			       ARM64_WORKAROUND_REPEAT_TLBI,
> 	       \
> > diff --git a/arch/arm64/kernel/cpu_errata.c
> b/arch/arm64/kernel/cpu_errata.c
> > index 307faa2b4395..7b702a79bf60 100644
> > --- a/arch/arm64/kernel/cpu_errata.c
> > +++ b/arch/arm64/kernel/cpu_errata.c
> > @@ -8,6 +8,7 @@
> >   #include <linux/arm-smccc.h>
> >   #include <linux/types.h>
> >   #include <linux/cpu.h>
> > +#include <linux/of.h>
> >   #include <asm/cpu.h>
> >   #include <asm/cputype.h>
> >   #include <asm/cpufeature.h>
> > @@ -55,6 +56,14 @@ is_kryo_midr(const struct arm64_cpu_capabilities
> *entry, int scope)
> >   	return model == entry->midr_range.model;
> >   }
> >
> > +static bool __maybe_unused
> > +is_imx8qm_soc(const struct arm64_cpu_capabilities *entry, int scope)
> > +{
> > +	WARN_ON(preemptible());
> > +
> > +	return of_machine_is_compatible("fsl,imx8qm");
> > +}
> > +
> >   static bool
> >   has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
> >   			  int scope)
> > @@ -729,6 +738,15 @@ const struct arm64_cpu_capabilities arm64_errata[]
> = {
> >   		MIDR_FIXED(MIDR_CPU_VAR_REV(1,1), BIT(25)),
> >   		.cpu_enable = cpu_clear_bf16_from_user_emulation,
> >   	},
> > +#endif
> > +#ifdef CONFIG_NXP_IMX8QM_ERRATUM_ERR050104
> > +	{
> > +		.desc = "NXP A53 cache coherency issue",
> > +		.capability = ARM64_WORKAROUND_NXP_ERR050104,
> > +		.type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
> > +		.matches = is_imx8qm_soc,
> > +		.cpu_enable = cpu_enable_cache_maint_trap,
> > +	},
> >   #endif
> >   	{
> >   	}
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index 4a79ba100799..4858f8c86fd5 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -529,6 +529,26 @@ void do_el1_fpac(struct pt_regs *regs, unsigned
> long esr)
> >   		uaccess_ttbr0_disable();			\
> >   	}
> >
> > +#define __user_instruction_cache_maint(address, res)		\
> > +do {								\
> > +	if (address >= TASK_SIZE_MAX) {				\
> > +		res = -EFAULT;					\
> > +	} else {						\
> > +		uaccess_ttbr0_enable();				\
> > +		asm volatile (					\
> > +			"1:\n"					\
> > +	    ALTERNATIVE("	ic	ivau, %1\n",		\
> > +			"	ic	ialluis\n",		\
> > +			ARM64_WORKAROUND_NXP_ERR050104)
> 	\
> > +			"	mov	%w0, #0\n"		\
> > +			"2:\n"					\
> > +			_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)	\
> > +			: "=r" (res)				\
> > +			: "r" (address));			\
> > +		uaccess_ttbr0_disable();			\
> > +	}							\
> > +} while (0)
> > +
> >   static void user_cache_maint_handler(unsigned long esr, struct pt_regs
> *regs)
> >   {
> >   	unsigned long tagged_address, address;
> > @@ -556,7 +576,7 @@ static void user_cache_maint_handler(unsigned
> long esr, struct pt_regs *regs)
> >   		__user_cache_maint("dc civac", address, ret);
> >   		break;
> >   	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
> > -		__user_cache_maint("ic ivau", address, ret);
> > +		__user_instruction_cache_maint(address, ret);
> >   		break;
> >   	default:
> >   		force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
> > diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> > index 37b1340e9646..e225f1cd1005 100644
> > --- a/arch/arm64/tools/cpucaps
> > +++ b/arch/arm64/tools/cpucaps
> > @@ -90,3 +90,4 @@ WORKAROUND_NVIDIA_CARMEL_CNP
> >   WORKAROUND_QCOM_FALKOR_E1003
> >   WORKAROUND_REPEAT_TLBI
> >   WORKAROUND_SPECULATIVE_AT
> > +WORKAROUND_NXP_ERR050104


More information about the linux-arm-kernel mailing list