[PATCH v2 11/17] arm64: errata: Add workaround for TSB flush failures
Suzuki K Poulose
suzuki.poulose at arm.com
Wed Sep 22 05:03:41 PDT 2021
Hi Anshuman
On 22/09/2021 08:39, Anshuman Khandual wrote:
>
>
> On 9/21/21 7:11 PM, Suzuki K Poulose wrote:
>> Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers
>> from errata, where a TSB (trace synchronization barrier)
>> fails to flush the trace data completely, when executed from
>> a trace prohibited region. In Linux we always execute it
>> after we have moved the PE to trace prohibited region. So,
>> we can apply the workaround everytime a TSB is executed.
>
> s/everytime/every time
Ack
>
>>
>> The work around is to issue two TSB consecutively.
>>
>> NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying
>> that a late CPU could be blocked from booting if it is the
>> first CPU that requires the workaround. This is because we
>> do not allow setting a cpu_hwcaps after the SMP boot. The
>> other alternative is to use "this_cpu_has_cap()" instead
>> of the faster system wide check, which may be a bit of an
>> overhead, given we may have to do this in nvhe KVM host
>> before a guest entry.
>>
>> Cc: Will Deacon <will at kernel.org>
>> Cc: Catalin Marinas <catalin.marinas at arm.com>
>> Cc: Mathieu Poirier <mathieu.poirier at linaro.org>
>> Cc: Mike Leach <mike.leach at linaro.org>
>> Cc: Mark Rutland <mark.rutland at arm.com>
>> Cc: Anshuman Khandual <anshuman.khandual at arm.com>
>> Cc: Marc Zyngier <maz at kernel.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose at arm.com>
>> ---
...
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index eac4030322df..0764774e12bb 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -705,6 +705,37 @@ config ARM64_ERRATUM_2139208
>>
>> If unsure, say Y.
>>
>> +config ARM64_WORKAROUND_TSB_FLUSH_FAILURE
>> + bool
>> +
>> +config ARM64_ERRATUM_2054223
>> + bool "Cortex-A710: 2054223: workaround TSB instruction failing to flush trace"
>> + default y
>> + help
>> + Enable workaround for ARM Cortex-A710 erratum 2054223
>> +
>> + Affected cores may fail to flush the trace data on a TSB instruction, when
>> + the PE is in trace prohibited state. This will cause losing a few bytes
>> + of the trace cached.
>> +
>> + Workaround is to issue two TSB consecutively on affected cores.
>> +
>> + If unsure, say Y.
>> +
>> +config ARM64_ERRATUM_2067961
>> + bool "Neoverse-N2: 2067961: workaround TSB instruction failing to flush trace"
>> + default y
>> + help
>> + Enable workaround for ARM Neoverse-N2 erratum 2067961
>> +
>> + Affected cores may fail to flush the trace data on a TSB instruction, when
>> + the PE is in trace prohibited state. This will cause losing a few bytes
>> + of the trace cached.
>> +
>> + Workaround is to issue two TSB consecutively on affected cores.
>
> Like I had mentioned in the previous patch, these descriptions here could
> be just factored out inside ARM64_WORKAROUND_TSB_FLUSH_FAILURE instead.
Please see my response there.
>
>> +
>> + If unsure, say Y.
>> +
>> config CAVIUM_ERRATUM_22375
>> bool "Cavium erratum 22375, 24313"
>> default y
>> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
>> index 451e11e5fd23..1c5a00598458 100644
>> --- a/arch/arm64/include/asm/barrier.h
>> +++ b/arch/arm64/include/asm/barrier.h
>> @@ -23,7 +23,7 @@
>> #define dsb(opt) asm volatile("dsb " #opt : : : "memory")
>>
>> #define psb_csync() asm volatile("hint #17" : : : "memory")
>> -#define tsb_csync() asm volatile("hint #18" : : : "memory")
>> +#define __tsb_csync() asm volatile("hint #18" : : : "memory")
>> #define csdb() asm volatile("hint #20" : : : "memory")
>>
>> #ifdef CONFIG_ARM64_PSEUDO_NMI
>> @@ -46,6 +46,20 @@
>> #define dma_rmb() dmb(oshld)
>> #define dma_wmb() dmb(oshst)
>>
>> +
>> +#define tsb_csync() \
>> + do { \
>> + /* \
>> + * CPUs affected by Arm Erratum 2054223 or 2067961 need \
>> + * another TSB to ensure the trace is flushed. The barriers \
>> + * don't have to be strictly back to back, as long as the \
>> + * CPU is in trace prohibited state. \
>> + */ \
>> + if (cpus_have_final_cap(ARM64_WORKAROUND_TSB_FLUSH_FAILURE)) \
>> + __tsb_csync(); \
>> + __tsb_csync(); \
>> + } while (0)
>> +
>> /*
>> * Generate a mask for array_index__nospec() that is ~0UL when 0 <= idx < sz
>> * and 0 otherwise.
>> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
>> index ccd757373f36..bdbeac75ead6 100644
>> --- a/arch/arm64/kernel/cpu_errata.c
>> +++ b/arch/arm64/kernel/cpu_errata.c
>> @@ -352,6 +352,18 @@ static const struct midr_range trbe_overwrite_fill_mode_cpus[] = {
>> };
>> #endif /* CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE */
>>
>> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE
>> +static const struct midr_range tsb_flush_fail_cpus[] = {
>> +#ifdef CONFIG_ARM64_ERRATUM_2067961
>> + MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
>> +#endif
>> +#ifdef CONFIG_ARM64_ERRATUM_2054223
>> + MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
>> +#endif
>> + {},
>> +};
>> +#endif /* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
>> +
>> const struct arm64_cpu_capabilities arm64_errata[] = {
>> #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
>> {
>> @@ -558,6 +570,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>> .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
>> CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus),
>> },
>> +#endif
>> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE
>> + {
>> + .desc = "ARM erratum 2067961 or 2054223",
>> + .capability = ARM64_WORKAROUND_TSB_FLUSH_FAILURE,
>> + ERRATA_MIDR_RANGE_LIST(tsb_flush_fail_cpus),
>> + },
>> #endif
>> {
>> }
>> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
>> index 1ccb92165bd8..2102e15af43d 100644
>> --- a/arch/arm64/tools/cpucaps
>> +++ b/arch/arm64/tools/cpucaps
>> @@ -54,6 +54,7 @@ WORKAROUND_1463225
>> WORKAROUND_1508412
>> WORKAROUND_1542419
>> WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>> +WORKAROUND_TSB_FLUSH_FAILURE
>> WORKAROUND_CAVIUM_23154
>> WORKAROUND_CAVIUM_27456
>> WORKAROUND_CAVIUM_30115
>>
>
> This adds all the required bits of these erratas in a single patch,
> where as the previous work around had split all the required pieces
> into multiple patches. Could we instead follow the same standard in
> both the places ?
We can do this in a single patch for this particular erratum, as the
workaround is entirely within the arm64 kernel code, unlike the other
ones, where the TRBE driver needs a change.
So, there is a kind of dependency for the other two, which we don't
have in this particular case.
i.e., the TRBE driver needs a cpucap number to implement the workaround,
so the arm64 kernel must define one, but we can't advertise it to the
user until we have the TRBE workaround.
Thus, they follow a 3-step model:
- Define CPUCAP erratum
- TRBE driver work around
- Finally advertise to the user.
I don't think this one needs that.
Suzuki
>