[PATCH 3/4] coresight: trbe: Work around the invalid prohibited states
Suzuki K Poulose
suzuki.poulose at arm.com
Thu Jan 6 01:40:15 PST 2022
On 06/01/2022 09:10, Anshuman Khandual wrote:
>
>
> On 1/5/22 7:24 PM, Suzuki K Poulose wrote:
>> On 05/01/2022 11:16, Anshuman Khandual wrote:
>>>
>>>
>>> On 1/5/22 3:43 PM, Suzuki K Poulose wrote:
>>>> Hi Anshuman
>>>>
>>>> On 05/01/2022 05:05, Anshuman Khandual wrote:
>>>>> TRBE implementations affected by Arm erratum #2038923 might get TRBE into
>>>>> an inconsistent view on whether trace is prohibited within the CPU. As a
>>>>> result, the trace buffer or trace buffer state might be corrupted. This
>>>>> happens after TRBE buffer has been enabled by setting TRBLIMITR_EL1.E,
>>>>> followed by just a single context synchronization event before execution
>>>>> changes from a context, in which trace is prohibited to one where it isn't,
>>>>> or vice versa. In these mentioned conditions, the view of whether trace is
>>>>> prohibited is inconsistent between parts of the CPU, and the trace buffer
>>>>> or the trace buffer state might be corrupted.
>>>>>
>>>>> Work around this problem in the TRBE driver by preventing an inconsistent
>>>>> view of whether the trace is prohibited or not based on TRBLIMITR_EL1.E by
>>>>> immediately following a change to TRBLIMITR_EL1.E with at least one ISB
>>>>> instruction before an ERET, or two ISB instructions if no ERET is to take
>>>>> place. This adds a new cpu errata in arm64 errata framework and also
>>>>> updates TRBE driver as required.
>>>>>
>>>>> Cc: Catalin Marinas <catalin.marinas at arm.com>
>>>>> Cc: Will Deacon <will at kernel.org>
>>>>> Cc: Mathieu Poirier <mathieu.poirier at linaro.org>
>>>>> Cc: Suzuki Poulose <suzuki.poulose at arm.com>
>>>>> Cc: coresight at lists.linaro.org
>>>>> Cc: linux-doc at vger.kernel.org
>>>>> Cc: linux-arm-kernel at lists.infradead.org
>>>>> Cc: linux-kernel at vger.kernel.org
>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual at arm.com>
>>>>> ---
>>>>> Documentation/arm64/silicon-errata.rst | 2 +
>>>>> arch/arm64/Kconfig | 23 ++++++++++
>>>>> arch/arm64/kernel/cpu_errata.c | 9 ++++
>>>>> arch/arm64/tools/cpucaps | 1 +
>>>>> drivers/hwtracing/coresight/coresight-trbe.c | 47 +++++++++++++++-----
>>>>> 5 files changed, 72 insertions(+), 10 deletions(-)
>>>>
>>>> As with the previous patch, it may be a good idea to split the
>>>> patch to arm64 and trbe parts.
>>>
>>> Sure, will do.
>>>
>>>>
>>>>>
>>>>> diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
>>>>> index c9b30e6c2b6c..e0ef3e9a4b8b 100644
>>>>> --- a/Documentation/arm64/silicon-errata.rst
>>>>> +++ b/Documentation/arm64/silicon-errata.rst
>>>>> @@ -54,6 +54,8 @@ stable kernels.
>>>>> +----------------+-----------------+-----------------+-----------------------------+
>>>>> | ARM | Cortex-A510 | #2064142 | ARM64_ERRATUM_2064142 |
>>>>> +----------------+-----------------+-----------------+-----------------------------+
>>>>> +| ARM | Cortex-A510 | #2038923 | ARM64_ERRATUM_2038923 |
>>>>> ++----------------+-----------------+-----------------+-----------------------------+
>>>>> | ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 |
>>>>> +----------------+-----------------+-----------------+-----------------------------+
>>>>> | ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 |
>>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>> index 2105b68d88db..026e34fb6fad 100644
>>>>> --- a/arch/arm64/Kconfig
>>>>> +++ b/arch/arm64/Kconfig
>>>>> @@ -796,6 +796,29 @@ config ARM64_ERRATUM_2064142
>>>>> If unsure, say Y.
>>>>> +config ARM64_ERRATUM_2038923
>>>>> + bool "Cortex-A510: 2038923: workaround TRBE corruption with enable"
>>>>> + depends on CORESIGHT_TRBE
>>>>> + default y
>>>>> + help
>>>>> + This option adds the workaround for ARM Cortex-A510 erratum 2038923.
>>>>> +
>>>>> + Affected Cortex-A510 core might cause an inconsistent view on whether trace is
>>>>> + prohibited within the CPU. As a result, the trace buffer or trace buffer state
>>>>> + might be corrupted. This happens after TRBE buffer has been enabled by setting
>>>>> + TRBLIMITR_EL1.E, followed by just a single context synchronization event before
>>>>> + execution changes from a context, in which trace is prohibited to one where it
>>>>> + isn't, or vice versa. In these mentioned conditions, the view of whether trace
>>>>> + is prohibited is inconsistent between parts of the CPU, and the trace buffer or
>>>>> + the trace buffer state might be corrupted.
>>>>> +
>>>>> + Work around this in the driver by preventing an inconsistent view of whether the
>>>>> + trace is prohibited or not based on TRBLIMITR_EL1.E by immediately following a
>>>>> + change to TRBLIMITR_EL1.E with at least one ISB instruction before an ERET, or
>>>>> + two ISB instructions if no ERET is to take place.
>>>>> +
>>>>> + If unsure, say Y.
>>>>> +
>>>>> config CAVIUM_ERRATUM_22375
>>>>> bool "Cavium erratum 22375, 24313"
>>>>> default y
>>>>> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
>>>>> index cbb7d5a9aee7..60b0c1f1d912 100644
>>>>> --- a/arch/arm64/kernel/cpu_errata.c
>>>>> +++ b/arch/arm64/kernel/cpu_errata.c
>>>>> @@ -607,6 +607,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>>>>> ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2)
>>>>> },
>>>>> #endif
>>>>> +#ifdef CONFIG_ARM64_ERRATUM_2038923
>>>>> + {
>>>>> + .desc = "ARM erratum 2038923",
>>>>> + .capability = ARM64_WORKAROUND_2038923,
>>>>> +
>>>>> + /* Cortex-A510 r0p0 - r0p2 */
>>>>> + ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2)
>>>>> + },
>>>>> +#endif
>>>>> {
>>>>> }
>>>>> };
>>>>> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
>>>>> index fca3cb329e1d..45a06d36d080 100644
>>>>> --- a/arch/arm64/tools/cpucaps
>>>>> +++ b/arch/arm64/tools/cpucaps
>>>>> @@ -56,6 +56,7 @@ WORKAROUND_1463225
>>>>> WORKAROUND_1508412
>>>>> WORKAROUND_1542419
>>>>> WORKAROUND_2064142
>>>>> +WORKAROUND_2038923
>>>>> WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>>>>> WORKAROUND_TSB_FLUSH_FAILURE
>>>>> WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
>>>>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>>>>> index ec24b62b2cec..0689c6dab96d 100644
>>>>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>>>>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>>>>> @@ -92,11 +92,13 @@ struct trbe_buf {
>>>>> #define TRBE_WORKAROUND_OVERWRITE_FILL_MODE 0
>>>>> #define TRBE_WORKAROUND_WRITE_OUT_OF_RANGE 1
>>>>> #define TRBE_WORKAROUND_SYSREG_WRITE_FAILURE 2
>>>>> +#define TRBE_WORKAROUND_CORRUPTION_WITH_ENABLE 3
>>>>> static int trbe_errata_cpucaps[] = {
>>>>> [TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE,
>>>>> [TRBE_WORKAROUND_WRITE_OUT_OF_RANGE] = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE,
>>>>> [TRBE_WORKAROUND_SYSREG_WRITE_FAILURE] = ARM64_WORKAROUND_2064142,
>>>>> + [TRBE_WORKAROUND_CORRUPTION_WITH_ENABLE] = ARM64_WORKAROUND_2038923,
>>>>> -1, /* Sentinel, must be the last entry */
>>>>> };
>>>>> @@ -174,6 +176,11 @@ static inline bool trbe_may_fail_sysreg_write(struct trbe_cpudata *cpudata)
>>>>> return trbe_has_erratum(cpudata, TRBE_WORKAROUND_SYSREG_WRITE_FAILURE);
>>>>> }
>>>>> +static inline bool trbe_may_corrupt_with_enable(struct trbe_cpudata *cpudata)
>>>>> +{
>>>>
>>>> minor nit: trbe_needs_{ctxt_sync, isb}_after_enable() ?
>>>
>>> trbe_needs_ctxt_sync_after_enable() sounds better. Also will have to change
>>> the index above as well .. TRBE_NEEDS_CTXT_SYNC_AFTER_ENABLE.
>>>
>>>>
>>>>> + return trbe_has_erratum(cpudata, TRBE_WORKAROUND_CORRUPTION_WITH_ENABLE);
>>>>> +}
>>>>> +
>>>>> static int trbe_alloc_node(struct perf_event *event)
>>>>> {
>>>>> if (event->cpu == -1)
>>>>> @@ -187,6 +194,30 @@ static inline void trbe_drain_buffer(void)
>>>>> dsb(nsh);
>>>>> }
>>>>> +static inline void set_trbe_enabled(struct trbe_cpudata *cpudata)
>>>>> +{
>>>>> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
>>>>
>>>> minor nit: This implies we do the TRBE programming in the following
>>>> manner in the common case (i.e, TRBE enabled in the beginning of a
>>>> session).
>>>> -> set TRBE LIMIT
>>>> -> read TRBE LIMIT
>>>> -> set TRBE ENABLED
>>>>
>>>> Could we please optimize this ? I believe the buf->trbe_limit
>>>> must hold the LIMITR value at any point in time. And thus this
>>>
>>> But is not bit risky though ! We have got the following places where
>>> given trbe_limit instance changes its value.
>>>
>>> drivers/../coresight-trbe.c: buf->trbe_limit = buf->trbe_base + nr_pages * PAGE_SIZE;
>>> drivers/../coresight-trbe.c: buf->trbe_limit = compute_trbe_buffer_limit(handle);
>>> drivers/../coresight-trbe.c: buf->trbe_limit -= PAGE_SIZE;
>>
>> Those are the places where we compute the trbe_limit, *before*
>> we enable the TRBE. And we don't change recompute the limit
>> *without disabling* the TRBE. To make it more clear, the
>> only place where we set TRBE enabled without "computing"
>> the trbe_limit is when we hit a spurious interrupt.
>> But the value in the TRBLIMITR should already match the
>> buf->trbe_limit and we are only going to re-enable the
>> TRBE with the same limit. The other option is to
>> pass down the "limit" to the set_trbe_enabled().
>
> Since there are just two instances where set_trbe_enabled() gets
> called, passing down an additional parameter 'trblimitr' should
> still be okay. Some additional code change (like the following)
> will achieve this. Does this look okay ?
>
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -201,10 +201,8 @@ static inline void trbe_drain_buffer(void)
> dsb(nsh);
> }
>
> -static inline void set_trbe_enabled(struct trbe_cpudata *cpudata)
> +static inline void set_trbe_enabled(struct trbe_cpudata *cpudata, u64 trblimitr)
> {
> - u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> -
> /*
> * Enable the TRBE without clearing LIMITPTR which
> * might be required for fetching the buffer limits.
> @@ -626,7 +624,7 @@ static void set_trbe_limit_pointer_enabled(struct trbe_buf *buf)
> trblimitr |= (addr & PAGE_MASK);
>
> write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
You could skip this ^ write. Otherwise looks good to me.
Suzuki
> - set_trbe_enabled(buf->cpudata);
> + set_trbe_enabled(buf->cpudata, trblimitr);
> }
>
> static void trbe_enable_hw(struct trbe_buf *buf)
> @@ -1050,13 +1048,14 @@ static int arm_trbe_disable(struct coresight_device *csdev)
> static void trbe_handle_spurious(struct perf_output_handle *handle)
> {
> struct trbe_buf *buf = etm_perf_sink_config(handle);
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
>
> /*
> * If the IRQ was spurious, simply re-enable the TRBE
> * back without modifying the buffer parameters to
> * retain the trace collected so far.
> */
> - set_trbe_enabled(buf->cpudata);
> + set_trbe_enabled(buf->cpudata, trblimitr);
> }
>
> static int trbe_handle_overflow(struct perf_output_handle *handle)
>
>
>>
>>>
>>>> function could simply be :
>>>>
>>>> set_trbe_enabled(trbe_buf)
>>>> {
>>>> limitr = trbe_buf->limit | LIMITR_ENABLE
>>>> write(limitr, TRBLIMITR_EL1);
>>>> ...
>>>> }
>>>
>>> Is the potential for performance improvement here, out weigh possible
>>> risks of using buf->trbe_limit directly while enabling the TRBE ?
>>
>> I somehow don't like the fact that we have additional write and read
>> for the most common case of the TRBE usage (i.e, for arm_trbe_enable()).
>> If we could avoid that, that may be better.
>>
>> Cheers
>> Suzuki
More information about the linux-arm-kernel
mailing list