[PATCH v6 3/4] arm64: arch_timer: Work around Erratum Hisilicon-161601

Ding Tianhong dingtianhong at huawei.com
Fri Jan 6 18:36:19 PST 2017



On 2017/1/6 22:49, Marc Zyngier wrote:
> On 05/01/17 05:31, Ding Tianhong wrote:
>> Erratum Hisilicon-161601 says that the ARM generic timer counter "has the
>> potential to contain an erroneous value when the timer value changes".
>> Accesses to TVAL (both read and write) are also affected due to the implicit counter
>> read.  Accesses to CVAL are not affected.
>>
>> The workaround is to reread the system count registers until the value of the second
>> read is larger than the first one by less than 32, the system counter can be guaranteed
>> not to return wrong value twice by back-to-back read and the error value is always larger
>> than the correct one by 32. Writes to TVAL are replaced with an equivalent write to CVAL.
>>
>> The workaround is enabled if the hisilicon,erratum-161601 property is found in
>> the timer node in the device tree. This can be overridden with the
>> clocksource.arm_arch_timer.hisilicon-161601 boot parameter, which allows KVM
>> users to enable the workaround until a mechanism is implemented to
>> automatically communicate this information.
>>
>> Fix some description for fsl erratum a008585.
>>
>> v2: Significant rework based on feedback, including seperate the fsl erratum a008585
>>     to another patch, update the erratum name and remove unwanted code.
>>
>> v3: Significant rework based on feedback, including fix some alignment problem, make the
>>     #define __hisi_161601_read_reg to be private to the .c file instead of being globally
>>     visible, add more accurate annotation and modify a bit of logical format to enable
>>     arch_timer_read_ool_enabled, remove the kernel commandline parameter
>>     clocksource.arm_arch_timer.hisilicon-161601.
>>
>> v5: Theoretically the erratum should not occur more than twice in succession when reading
>>     the system counter, but it is possible that some interrupts may lead to more than twice
>>     read errors, triggering the warning, so setting the number of retries to 50 which is far
>>     beyond the number of iterations the loop has been observed to take.
>>
>> v6: Mark the hisi_161601_read_xxx_el0 with notrace, if CONFIG_FUNCTION_GRAPH_TRACER selected,
>>     hisi_161601_read_xxx_el0 will be related to ftrace_graph_caller which will calling sched_clock
>>     to read system counter again, it will cause the system stall into an endless loop.
> 
> Please move the changelog to the cover letter, as I don't need
> any of this to end-up in the commit log.
> 

OK.


>>
>> Signed-off-by: Ding Tianhong <dingtianhong at huawei.com>
>> ---
>>  Documentation/arm64/silicon-errata.txt |  1 +
>>  arch/arm64/include/asm/arch_timer.h    |  2 +-
>>  drivers/clocksource/Kconfig            |  9 +++++
>>  drivers/clocksource/arm_arch_timer.c   | 70 +++++++++++++++++++++++++++++++---
>>  4 files changed, 76 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
>> index 405da11..1c1a95f 100644
>> --- a/Documentation/arm64/silicon-errata.txt
>> +++ b/Documentation/arm64/silicon-errata.txt
>> @@ -63,3 +63,4 @@ stable kernels.
>>  | Cavium         | ThunderX SMMUv2 | #27704          | N/A		       |
>>  |                |                 |                 |                         |
>>  | Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585     |
>> +| Hisilicon      | Hip0{5,6,7}     | #161601         | HISILICON_ERRATUM_161601|
>> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
>> index f882c7c..ebf4cde 100644
>> --- a/arch/arm64/include/asm/arch_timer.h
>> +++ b/arch/arm64/include/asm/arch_timer.h
>> @@ -29,7 +29,7 @@
>>  
>>  #include <clocksource/arm_arch_timer.h>
>>  
>> -#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585)
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  extern struct static_key_false arch_timer_read_ool_enabled;
>>  #define needs_unstable_timer_counter_workaround() \
>>  	static_branch_unlikely(&arch_timer_read_ool_enabled)
>> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
>> index 4866f7a..162d820 100644
>> --- a/drivers/clocksource/Kconfig
>> +++ b/drivers/clocksource/Kconfig
>> @@ -335,6 +335,15 @@ config FSL_ERRATUM_A008585
>>  	  value").  The workaround will only be active if the
>>  	  fsl,erratum-a008585 property is found in the timer node.
>>  
>> +config HISILICON_ERRATUM_161601
>> +	bool "Workaround for Hisilicon Erratum 161601"
>> +	default y
>> +	depends on ARM_ARCH_TIMER && ARM64
>> +	help
>> +	  This option enables a workaround for Hisilicon Erratum
>> +	  161601. The workaround will be active if the hisilicon,erratum-161601
>> +	  property is found in the timer node.
>> +
>>  config ARM_GLOBAL_TIMER
>>  	bool "Support for the ARM global timer" if COMPILE_TEST
>>  	select CLKSRC_OF if OF
>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
>> index f9097b6..078d38f 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -95,15 +95,18 @@ static int __init early_evtstrm_cfg(char *buf)
>>   * Architected system timer support.
>>   */
>>  
>> -#ifdef CONFIG_FSL_ERRATUM_A008585
>> +#if CONFIG_FSL_ERRATUM_A008585 || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
> 
> This line looks horrible. it should probably be IS_ENABLED(CONFIG_FSL).
> But more importantly, given that at least two independent SoC
> manufacturers have managed to screw their timers in a similar way,
> I'd expect that list to grow.
> 
> So please introduce a new config symbol (for example something like
> CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND) which gets selected by individual
> workarounds, and use that everywhere where you have both symbols.
> 

Fine, will introduce a new general config.

>>  struct arch_timer_erratum_workaround *timer_unstable_counter_workaround = NULL;
>>  EXPORT_SYMBOL_GPL(timer_unstable_counter_workaround);
>>  
>>  #define        FSL_A008585	0x0001
>> +#define        HISILICON_161601	0x0002
>>  
>>  DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled);
>>  EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled);
>> +#endif
>>  
>> +#ifdef CONFIG_FSL_ERRATUM_A008585
>>  /*
>>   * The number of retries is an arbitrary value well beyond the highest number
>>   * of iterations the loop has been observed to take.
>> @@ -145,6 +148,54 @@ static u64 notrace fsl_a008585_read_cntvct_el0(void)
>>  };
>>  #endif /* CONFIG_FSL_ERRATUM_A008585 */
>>  
>> +#ifdef CONFIG_HISILICON_ERRATUM_161601
>> +/*
>> + * Verify whether the value of the second read is larger than the first by
>> + * less than 32 is the only way to confirm the value is correct, so clear the
>> + * lower 5 bits to check whether the difference is greater than 32 or not.
>> + * Theoretically the erratum should not occur more than twice in succession
>> + * when reading the system counter, but it is possible that some interrupts
>> + * may lead to more than twice read errors, triggering the warning, so setting
>> + * the number of retries far beyond the number of iterations the loop has been
>> + * observed to take.
>> + */
>> +#define __hisi_161601_read_reg(reg) ({				\
>> +	u64 _old, _new;						\
>> +	int _retries = 50;					\
>> +								\
>> +	do {							\
>> +		_old = read_sysreg(reg);			\
>> +		_new = read_sysreg(reg);			\
>> +		_retries--;					\
>> +	} while (unlikely((_new - _old) >> 5) && _retries);	\
>> +								\
>> +	WARN_ON_ONCE(!_retries);				\
>> +	_new;							\
>> +})
>> +
>> +static u32 notrace hisi_161601_read_cntp_tval_el0(void)
>> +{
>> +	return __hisi_161601_read_reg(cntp_tval_el0);
>> +}
>> +
>> +static u32 notrace hisi_161601_read_cntv_tval_el0(void)
>> +{
>> +	return __hisi_161601_read_reg(cntv_tval_el0);
>> +}
>> +
>> +static u64 notrace hisi_161601_read_cntvct_el0(void)
>> +{
>> +	return __hisi_161601_read_reg(cntvct_el0);
>> +}
>> +
>> +static struct arch_timer_erratum_workaround arch_timer_hisi_161601 = {
>> +	.erratum = HISILICON_161601,
>> +	.read_cntp_tval_el0 = hisi_161601_read_cntp_tval_el0,
>> +	.read_cntv_tval_el0 = hisi_161601_read_cntv_tval_el0,
>> +	.read_cntvct_el0 = hisi_161601_read_cntvct_el0,
>> +};
>> +#endif /* CONFIG_HISILICON_ERRATUM_161601 */
>> +
>>  static __always_inline
>>  void arch_timer_reg_write(int access, enum arch_timer_reg reg, u32 val,
>>  			  struct clock_event_device *clk)
>> @@ -294,7 +345,7 @@ static __always_inline void set_next_event(const int access, unsigned long evt,
>>  	arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk);
>>  }
>>  
>> -#ifdef CONFIG_FSL_ERRATUM_A008585
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  static __always_inline void erratum_set_next_event_generic(const int access,
>>  		unsigned long evt, struct clock_event_device *clk)
>>  {
>> @@ -358,7 +409,7 @@ static int arch_timer_set_next_event_phys_mem(unsigned long evt,
>>  
>>  static void erratum_workaround_set_sne(struct clock_event_device *clk)
>>  {
>> -#ifdef CONFIG_FSL_ERRATUM_A008585
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  	if (!static_branch_unlikely(&arch_timer_read_ool_enabled))
>>  		return;
>>  
>> @@ -618,7 +669,7 @@ static void __init arch_counter_register(unsigned type)
>>  
>>  		clocksource_counter.archdata.vdso_direct = true;
>>  
>> -#ifdef CONFIG_FSL_ERRATUM_A008585
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  		/*
>>  		 * Don't use the vdso fastpath if errata require using
>>  		 * the out-of-line counter accessor.
>> @@ -909,10 +960,19 @@ static int __init arch_timer_of_init(struct device_node *np)
>>  #ifdef CONFIG_FSL_ERRATUM_A008585
>>  	if (!timer_unstable_counter_workaround && of_property_read_bool(np, "fsl,erratum-a008585"))
>>  		timer_unstable_counter_workaround = &arch_timer_fsl_a008585;
>> +#endif
>> +
>> +#ifdef CONFIG_HISILICON_ERRATUM_161601
>> +	if (!timer_unstable_counter_workaround && of_property_read_bool(np, "hisilicon,erratum-161601"))
>> +		timer_unstable_counter_workaround = &arch_timer_hisi_161601;
>> +#endif
>>  
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  	if (timer_unstable_counter_workaround) {
>>  		static_branch_enable(&arch_timer_read_ool_enabled);
>> -		pr_info("Enabling workaround for FSL erratum A-008585\n");
>> +		pr_info("Enabling workaround for %s\n",
>> +			timer_unstable_counter_workaround->erratum == FSL_A008585 ?
>> +			"FSL ERRATUM A-008585" : "HISILICON ERRATUM 161601");
>>  	}
>>  #endif
> 
> This looks extremely messy, and maybe it is time you properly refactor
> the whole thing so that we can scale. I came up with the following
> approach, which seems more manageable:
> 
> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
> index ebf4cde..1f89d94 100644
> --- a/arch/arm64/include/asm/arch_timer.h
> +++ b/arch/arm64/include/asm/arch_timer.h
> @@ -37,15 +37,14 @@ extern struct static_key_false arch_timer_read_ool_enabled;
>  #define needs_unstable_timer_counter_workaround()  false
>  #endif
>  
> -
>  struct arch_timer_erratum_workaround {
> -	int erratum;		/* Indicate the Erratum ID */
> +	const char *id;
>  	u32 (*read_cntp_tval_el0)(void);
>  	u32 (*read_cntv_tval_el0)(void);
>  	u64 (*read_cntvct_el0)(void);
>  };
>  
> -extern struct arch_timer_erratum_workaround *timer_unstable_counter_workaround;
> +extern const struct arch_timer_erratum_workaround *timer_unstable_counter_workaround;
>  
>  #define arch_timer_reg_read_stable(reg) 		\
>  ({							\
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index c069f1a..0dd80e6 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -96,12 +96,9 @@ early_param("clocksource.arm_arch_timer.evtstrm", early_evtstrm_cfg);
>   */
>  
>  #if CONFIG_FSL_ERRATUM_A008585 || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
> -struct arch_timer_erratum_workaround *timer_unstable_counter_workaround = NULL;
> +const struct arch_timer_erratum_workaround *timer_unstable_counter_workaround;
>  EXPORT_SYMBOL_GPL(timer_unstable_counter_workaround);
>  
> -#define        FSL_A008585	0x0001
> -#define        HISILICON_161601	0x0002
> -
>  DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled);
>  EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled);
>  #endif
> @@ -139,13 +136,6 @@ static u64 notrace fsl_a008585_read_cntvct_el0(void)
>  {
>  	return __fsl_a008585_read_reg(cntvct_el0);
>  }
> -
> -static struct arch_timer_erratum_workaround arch_timer_fsl_a008585 = {
> -	.erratum = FSL_A008585,
> -	.read_cntp_tval_el0 = fsl_a008585_read_cntp_tval_el0,
> -	.read_cntv_tval_el0 = fsl_a008585_read_cntv_tval_el0,
> -	.read_cntvct_el0 = fsl_a008585_read_cntvct_el0,
> -};
>  #endif /* CONFIG_FSL_ERRATUM_A008585 */
>  
>  #ifdef CONFIG_HISILICON_ERRATUM_161601
> @@ -187,13 +177,6 @@ static u64 notrace hisi_161601_read_cntvct_el0(void)
>  {
>  	return __hisi_161601_read_reg(cntvct_el0);
>  }
> -
> -static struct arch_timer_erratum_workaround arch_timer_hisi_161601 = {
> -	.erratum = HISILICON_161601,
> -	.read_cntp_tval_el0 = hisi_161601_read_cntp_tval_el0,
> -	.read_cntv_tval_el0 = hisi_161601_read_cntv_tval_el0,
> -	.read_cntvct_el0 = hisi_161601_read_cntvct_el0,
> -};
>  #endif /* CONFIG_HISILICON_ERRATUM_161601 */
>  
>  static __always_inline
> @@ -346,6 +329,25 @@ static __always_inline void set_next_event(const int access, unsigned long evt,
>  }
>  
>  #if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
> +static const struct arch_timer_erratum_workaround ool_workarounds[] = {
> +#ifdef CONFIG_FSL_ERRATUM_A008585
> +	{
> +		.id = "fsl,erratum-a008585",
> +		.read_cntp_tval_el0 = fsl_a008585_read_cntp_tval_el0,
> +		.read_cntv_tval_el0 = fsl_a008585_read_cntv_tval_el0,
> +		.read_cntvct_el0 = fsl_a008585_read_cntvct_el0,
> +	},
> +#endif
> +#ifdef CONFIG_HISILICON_ERRATUM_161601
> +	{
> +		.id = "hisilicon,erratum-161601",
> +		.read_cntp_tval_el0 = hisi_161601_read_cntp_tval_el0,
> +		.read_cntv_tval_el0 = hisi_161601_read_cntv_tval_el0,
> +		.read_cntvct_el0 = hisi_161601_read_cntvct_el0,
> +	},
> +#endif
> +};
> +
>  static __always_inline void erratum_set_next_event_generic(const int access,
>  		unsigned long evt, struct clock_event_device *clk)
>  {
> @@ -957,22 +959,15 @@ static int __init arch_timer_of_init(struct device_node *np)
>  
>  	arch_timer_c3stop = !of_property_read_bool(np, "always-on");
>  
> -#ifdef CONFIG_FSL_ERRATUM_A008585
> -	if (!timer_unstable_counter_workaround && of_property_read_bool(np, "fsl,erratum-a008585"))
> -		timer_unstable_counter_workaround = &arch_timer_fsl_a008585;
> -#endif
> -
> -#ifdef CONFIG_HISILICON_ERRATUM_161601
> -	if (!timer_unstable_counter_workaround && of_property_read_bool(np, "hisilicon,erratum-161601"))
> -		timer_unstable_counter_workaround = &arch_timer_hisi_161601;
> -#endif
> -
>  #if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
> -	if (timer_unstable_counter_workaround) {
> -		static_branch_enable(&arch_timer_read_ool_enabled);
> -		pr_info("Enabling workaround for %s\n",
> -			timer_unstable_counter_workaround->erratum == FSL_A008585 ?
> -			"FSL ERRATUM A-008585" : "HISILICON ERRATUM 161601");
> +	for (i = 0; i < ARRAY_SIZE(ool_workarounds); i++) {
> +		if (of_property_read_bool(np, ool_workarounds[i].id)) {
> +			timer_unstable_counter_workaround = &ool_workarounds[i];
> +			static_branch_enable(&arch_timer_read_ool_enabled);
> +			pr_info("Enabling workaround %s\n",
> +				timer_unstable_counter_workaround->id);
> +			break;
> +		}
>  	}
>  #endif
>  
this changes looks good to me, I will test it and release a new version, thanks a lot.

Ding
> 
> Thoughts?
> 
> 	M.
> 




More information about the linux-arm-kernel mailing list