[PATCH v6 3/4] arm64: arch_timer: Work around QorIQ Erratum A-008585

Marc Zyngier marc.zyngier at arm.com
Fri Sep 23 06:17:30 PDT 2016


On 22/09/16 09:35, Scott Wood wrote:
> Erratum A-008585 says that the ARM generic timer counter "has the
> potential to contain an erroneous value for a small number of core
> clock cycles every time the timer value changes".  Accesses to TVAL
> (both read and write) are also affected due to the implicit counter
> read.  Accesses to CVAL are not affected.
> 
> The workaround is to reread TVAL and count registers until successive
> reads return the same value.  Writes to TVAL are replaced with an
> equivalent write to CVAL.
> 
> The workaround is to reread TVAL and count registers until successive reads
> return the same value, and when writing TVAL to retry until counter
> reads before and after the write return the same value.
> 
> The workaround is enabled if the fsl,erratum-a008585 property is found in
> the timer node in the device tree.  This can be overridden with the
> clocksource.arm_arch_timer.fsl-a008585 boot parameter, which allows KVM
> users to enable the workaround until a mechanism is implemented to
> automatically communicate this information.
> 
> This erratum can be found on LS1043A and LS2080A.
> 
> Signed-off-by: Scott Wood <oss at buserror.net>
> ---
> v6:
> - Addressed feedback from Mark Rutland
> 
> v5:
> - Export arch_timer_read_ool_enabled so that get_cycles() can be called
>   from modules.
> 
> v4:
> - Undef ARCH_TIMER_REG_READ after use
> 
> v3:
> - Used cval rather than a loop for the write side of the erratum
> - Added a Kconfig control
> - Moved the device tree binding into its own patch
> - Added erratum to silicon-errata.txt
> - Changed function names to contain the erratum name
> - Factored out the setting of erratum versions of set_next_event
>   to improve readability
> - Added a comment clarifying that the timeout is arbitrary
> 
> v2:
> Significant rework based on feedback, including using static_key,
> disabling VDSO counter access rather than adding the workaround to the
> VDSO, and uninlining the loops.
> 
> Dropped the separate property for indicating that writes to TVAL are
> affected, as I believe that's just a side effect of the implicit
> counter read being corrupted, and thus a chip that is affected by one
> will always be affected by the other.
> 
> Dropped the arm32 portion as it seems there was confusion about whether
> LS1021A is affected.  Currently I am being told that it is not
> affected.
> 
> I considered writing to CVAL rather than looping on TVAL writes, but
> that would still have required separate set_next_event() code for the
> erratum, and adding CVAL to the enum would have required a bunch of
> extra handlers in switch statements (even where unused, due to compiler
> warnings about unhandled enum values) including in an arm32 header.  It
> seemed better to avoid the arm32 interaction and new untested
> accessors.
> 
> Signed-off-by: Scott Wood <oss at buserror.net>
> ---
>  Documentation/arm64/silicon-errata.txt |   2 +
>  Documentation/kernel-parameters.txt    |   9 +++
>  arch/arm64/include/asm/arch_timer.h    |  47 ++++++++++++++-
>  drivers/clocksource/Kconfig            |  10 ++++
>  drivers/clocksource/arm_arch_timer.c   | 104 +++++++++++++++++++++++++++++++++
>  5 files changed, 169 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
> index 4da60b4..041e3a9 100644
> --- a/Documentation/arm64/silicon-errata.txt
> +++ b/Documentation/arm64/silicon-errata.txt
> @@ -60,3 +60,5 @@ stable kernels.
>  | Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154    |
>  | Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456    |
>  | Cavium         | ThunderX SMMUv2 | #27704          | N/A		       |
> +|                |                 |                 |                         |
> +| Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585     |
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 46c030a..fb4de4d 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -698,6 +698,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			loops can be debugged more effectively on production
>  			systems.
>  
> +	clocksource.arm_arch_timer.fsl-a008585=
> +			[ARM64]
> +			Format: <bool>
> +			Enable/disable the workaround of Freescale/NXP
> +			erratum A-008585.  This can be useful for KVM
> +			guests, if the guest device tree doesn't show the
> +			erratum.  If unspecified, the workaround is
> +			enabled based on the device tree.
> +
>  	clearcpuid=BITNUM [X86]
>  			Disable CPUID feature X for the kernel. See
>  			arch/x86/include/asm/cpufeatures.h for the valid bit
> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
> index 7ff386c..cddd5b7 100644
> --- a/arch/arm64/include/asm/arch_timer.h
> +++ b/arch/arm64/include/asm/arch_timer.h
> @@ -24,10 +24,51 @@
>  
>  #include <linux/bug.h>
>  #include <linux/init.h>
> +#include <linux/jump_label.h>
>  #include <linux/types.h>
>  
>  #include <clocksource/arm_arch_timer.h>
>  
> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585)
> +extern struct static_key_false arch_timer_read_ool_enabled;
> +#define needs_fsl_a008585_workaround() \
> +	static_branch_unlikely(&arch_timer_read_ool_enabled)
> +#else
> +#define needs_fsl_a008585_workaround()  false
> +#endif
> +
> +u32 __fsl_a008585_read_cntp_tval_el0(void);
> +u32 __fsl_a008585_read_cntv_tval_el0(void);
> +u64 __fsl_a008585_read_cntvct_el0(void);
> +
> +/*
> + * The number of retries is an arbitrary value well beyond the highest number
> + * of iterations the loop has been observed to take.
> + */
> +#define __fsl_a008585_read_reg(reg) ({			\
> +	u64 _old, _new;					\
> +	int _retries = 200;				\
> +							\
> +	do {						\
> +		_old = read_sysreg(reg);		\
> +		_new = read_sysreg(reg);		\
> +		_retries--;				\
> +	} while (unlikely(_old != _new) && _retries);	\
> +							\
> +	WARN_ON_ONCE(!_retries);			\
> +	_new;						\
> +})
> +
> +#define arch_timer_unstable_reg_read(reg) 		\

I think this name is the only thing I don't like about this patch,
because it is only unstable if the workaround is on. This is a minor
thing and it can be addressed when/after merging it. No need to respin
it on this account.

Acked-by: Marc Zyngier <marc.zyngier at arm.com>

The usual question is "Who takes it?". I'm quite keen on it, as my
LS2085 is otherwise completely unusable.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...



More information about the linux-arm-kernel mailing list