[PATCH v2] arm64: errata: Add NXP iMX8QM workaround for A53 Cache coherency issue

Ivan T. Ivanov iivanov at suse.de
Thu Apr 20 04:29:52 PDT 2023


According to NXP errata document[1] i.MX8QuadMax SoC suffers from
serious cache coherence issue. It was also mentioned in initial
support[2] for imx8qm mek machine.

I chose to use an ALTERNATIVE() framework, instead downstream solution[3],
for this issue with the hope to reduce effect of this fix on unaffected
platforms.

Unfortunately I was unable to find a way to identify SoC ID using
registers. Boot CPU MIDR_EL1 is equal to 0x410fd034, AIDR_EL1 is
equal to 0. So I fallback to using devicetree compatible strings for this.
And because we can not guarantee that VMs on top of KVM will get
correct devicetree KVM is disabled.

Also make sure that SMMU "Broadcast TLB Maintenance" is not used even
SMMU device build in this SoC supports it.

I know this fix is a suboptimal solution for affected machines, but I
haven't been able to come up with a less intrusive fix.  And I hope once
TLB caches are invalidated any immediate attempt to invalidate them again
will be close to NOP operation (flush_tlb_kernel_range())

I have run few simple benchmarks and perf tests on affected and unaffected
machines and I was not able see any obvious issues. iMX8QM "performance"
was nearly doubled with 2 A72 bringed online.

Following is excerpt from NXP IMX8_1N94W "Mask Set Errata" document
Rev. 5, 3/2023. Just in case it gets lost somehow.

--
"ERR050104: Arm/A53: Cache coherency issue"

Description

Some maintenance operations exchanged between the A53 and A72
core clusters, involving some Translation Look-aside Buffer
Invalidate (TLBI) and Instruction Cache (IC) instructions can
be corrupted. The upper bits, above bit-35, of ARADDR and ACADDR
buses within in Arm A53 sub-system have been incorrectly connected.
Therefore ARADDR and ACADDR address bits above bit-35 should not
be used.

Workaround

The following software instructions are required to be downgraded
to TLBI VMALLE1IS:  TLBI ASIDE1, TLBI ASIDE1IS, TLBI VAAE1,
TLBI VAAE1IS, TLBI VAALE1, TLBI VAALE1IS, TLBI VAE1, TLBI VAE1IS,
TLBI VALE1, TLBI VALE1IS

The following software instructions are required to be downgraded
to TLBI VMALLS12E1IS: TLBI IPAS2E1IS, TLBI IPAS2LE1IS

The following software instructions are required to be downgraded
to TLBI ALLE2IS: TLBI VAE2IS, TLBI VALE2IS.

The following software instructions are required to be downgraded
to TLBI ALLE3IS: TLBI VAE3IS, TLBI VALE3IS.

The following software instructions are required to be downgraded
to TLBI VMALLE1IS when the Force Broadcast (FB) bit [9] of the
Hypervisor Configuration Register (HCR_EL2) is set:
TLBI ASIDE1, TLBI VAAE1, TLBI VAALE1, TLBI VAE1, TLBI VALE1

The following software instruction is required to be downgraded
to IC IALLUIS: IC IVAU, Xt

Specifically for the IC IVAU, Xt downgrade, setting SCTLR_EL1.UCI
to 0 will disable EL0 access to this instruction. Any attempt to
execute from EL0 will generate an EL1 trap, where the downgrade to
IC ALLUIS can be implemented.
--

[1] https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf
[2] commit 307fd14d4b14c ("arm64: dts: imx: add imx8qm mek support")
[3] https://github.com/nxp-imx/linux-imx/blob/lf-6.1.y/arch/arm64/include/asm/tlbflush.h#L19

Signed-off-by: Ivan T. Ivanov <iivanov at suse.de>
---

Changes since v1:
https://lore.kernel.org/linux-arm-kernel/20230412125506.21634-1-iivanov@suse.de/

 - Disable KVM on affected SoC add a note in commit message why this is done.
 - Updated Kconfig help message and made option enabled by default, like
   the rest of the workarounds.
 - Removed unnecessary DSB ISH + ISB instructions from the ALTERNATIVES.
 - Reworked handling of user-space IC IVAU.
 - Adjusted position of workaround definition in cpucaps file
 - Make sure that "Broadcast TLB Maintenance" is not used even if SMMU device
   build in this SoC supports it. And note it in commit message.

 Documentation/arm64/silicon-errata.rst |  2 ++
 arch/arm64/Kconfig                     | 12 ++++++++++++
 arch/arm64/include/asm/tlbflush.h      |  6 +++++-
 arch/arm64/kernel/cpu_errata.c         | 18 ++++++++++++++++++
 arch/arm64/kernel/traps.c              |  5 +++++
 arch/arm64/kvm/arm.c                   |  5 +++++
 arch/arm64/tools/cpucaps               |  1 +
 drivers/iommu/arm/arm-smmu/arm-smmu.c  |  4 ++++
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |  1 +
 9 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index ec5f889d7681..fce231797184 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -175,6 +175,8 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585         |
 +----------------+-----------------+-----------------+-----------------------------+
+| Freescale/NXP  | i.MX 8QuadMax   | ERR050104       | NXP_IMX8QM_ERRATUM_ERR050104|
++----------------+-----------------+-----------------+-----------------------------+
 +----------------+-----------------+-----------------+-----------------------------+
 | Hisilicon      | Hip0{5,6,7}     | #161010101      | HISILICON_ERRATUM_161010101 |
 +----------------+-----------------+-----------------+-----------------------------+
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1023e896d46b..a5fe6ffb8150 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1159,6 +1159,18 @@ config SOCIONEXT_SYNQUACER_PREITS
 
 	  If unsure, say Y.
 
+config NXP_IMX8QM_ERRATUM_ERR050104
+	bool "NXP iMX8QM ERR050104: broken cache/tlb invalidation"
+	default y
+	help
+	  On iMX8QM, addresses above bit 35 are not broadcast correctly for
+	  TLBI or IC operations, making TLBI and IC unreliable.
+
+	  Work around this erratum by using TLBI *ALL*IS and IC IALLUIS
+	  operations. EL0 use of IC IVAU is trapped and upgraded to IC IALLUIS.
+
+	  If unsure, say Y.
+
 endmenu # "ARM errata workarounds via the alternatives framework"
 
 choice
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 412a3b9a3c25..e3bad2298ea5 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -37,7 +37,11 @@
 			    : : )
 
 #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE			       \
-			       "tlbi " #op ", %0\n"			       \
+		   ALTERNATIVE("tlbi " #op ", %0\n",			       \
+			       "tlbi vmalle1is\n",			       \
+			       ARM64_WORKAROUND_NXP_ERR050104)		       \
+			    : : "r" (arg));				       \
+			  asm (ARM64_ASM_PREAMBLE			       \
 		   ALTERNATIVE("nop\n			nop",		       \
 			       "dsb ish\n		tlbi " #op ", %0",     \
 			       ARM64_WORKAROUND_REPEAT_TLBI,		       \
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 307faa2b4395..b0647b64dbb8 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -8,6 +8,7 @@
 #include <linux/arm-smccc.h>
 #include <linux/types.h>
 #include <linux/cpu.h>
+#include <linux/of.h>
 #include <asm/cpu.h>
 #include <asm/cputype.h>
 #include <asm/cpufeature.h>
@@ -55,6 +56,14 @@ is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
 	return model == entry->midr_range.model;
 }
 
+static bool __maybe_unused
+is_imx8qm_soc(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	WARN_ON(preemptible());
+
+	return of_machine_is_compatible("fsl,imx8qm");
+}
+
 static bool
 has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
 			  int scope)
@@ -729,6 +738,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		MIDR_FIXED(MIDR_CPU_VAR_REV(1,1), BIT(25)),
 		.cpu_enable = cpu_clear_bf16_from_user_emulation,
 	},
+#endif
+#ifdef CONFIG_NXP_IMX8QM_ERRATUM_ERR050104
+	{
+		.desc = "NXP erratum ERR050104",
+		.capability = ARM64_WORKAROUND_NXP_ERR050104,
+		.type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
+		.matches = is_imx8qm_soc,
+		.cpu_enable = cpu_enable_cache_maint_trap,
+	},
 #endif
 	{
 	}
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 4a79ba100799..265b6334291b 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -556,6 +556,11 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
 		__user_cache_maint("dc civac", address, ret);
 		break;
 	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
+		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
+			asm volatile("ic ialluis");
+			ret = 0;
+			break;
+		}
 		__user_cache_maint("ic ivau", address, ret);
 		break;
 	default:
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 4b2e16e696a8..5066332302d2 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2239,6 +2239,11 @@ static __init int kvm_arm_init(void)
 		return -ENODEV;
 	}
 
+	if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
+		kvm_info("KVM is not supported on this system\n");
+		return -ENODEV;
+	}
+
 	err = kvm_sys_reg_table_init();
 	if (err) {
 		kvm_info("Error initializing system register tables");
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 37b1340e9646..c1de9235f922 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -87,6 +87,7 @@ WORKAROUND_CAVIUM_TX2_219_TVM
 WORKAROUND_CLEAN_CACHE
 WORKAROUND_DEVICE_LOAD_ACQUIRE
 WORKAROUND_NVIDIA_CARMEL_CNP
+WORKAROUND_NXP_ERR050104
 WORKAROUND_QCOM_FALKOR_E1003
 WORKAROUND_REPEAT_TLBI
 WORKAROUND_SPECULATIVE_AT
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 2ff7a72cf377..29bbb16bae6e 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1680,6 +1680,10 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	/* ID0 */
 	id = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID0);
 
+	/* Do not use Broadcast TLB Maintenance on affected platforms */
+	if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104))
+		id &= ~ARM_SMMU_ID0_BTM;
+
 	/* Restrict available stages based on module parameter */
 	if (force_stage == 1)
 		id &= ~(ARM_SMMU_ID0_S2TS | ARM_SMMU_ID0_NTS);
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 703fd5817ec1..1589acfdbed9 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -52,6 +52,7 @@
 #define ARM_SMMU_ID0_PTFS_NO_AARCH32S	BIT(24)
 #define ARM_SMMU_ID0_NUMIRPT		GENMASK(23, 16)
 #define ARM_SMMU_ID0_CTTW		BIT(14)
+#define ARM_SMMU_ID0_BTM		BIT(13)
 #define ARM_SMMU_ID0_NUMSIDB		GENMASK(12, 9)
 #define ARM_SMMU_ID0_EXIDS		BIT(8)
 #define ARM_SMMU_ID0_NUMSMRG		GENMASK(7, 0)
-- 
2.35.3




More information about the linux-arm-kernel mailing list