[PATCH] clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error
Daniel Lezcano
daniel.lezcano at free.fr
Tue Nov 22 00:11:13 PST 2022
On 21/11/2022 15:53, Marc Zyngier wrote:
> From: Joe Korty <joe.korty at concurrent-rt.com>
>
> The TVAL register is 32 bit signed. Thus only the lower 31 bits are
> available to specify when an interrupt is to occur at some time in the
> near future. Attempting to specify a larger interval with TVAL results
> in a negative time delta which means the timer fires immediately upon
> being programmed, rather than firing at that expected future time.
>
> The solution is for Linux to declare that TVAL is a 31 bit register rather
> than give its true size of 32 bits. This prevents Linux from programming
> TVAL with a too-large value. Note that, prior to 5.16, this little trick
> was the standard way to handle TVAL in Linux, so there is nothing new
> happening here on that front.
>
> The softlockup detector hides the issue, because it keeps generating
> short timer deadlines that are within the scope of the broken timer.
>
> Disable it, and you start using NO_HZ with much longer timer deadlines,
> which turns into an interrupt flood:
>
> 11: 1124855130 949168462 758009394 76417474 104782230 30210281
> 310890 1734323687 GICv2 29 Level arch_timer
>
> And "much longer" isn't that long: it takes less than 43s to underflow
> TVAL at 50MHz (the frequency of the counter on XGene-1).
>
> Some comments on the v1 version of this patch by Marc Zyngier:
>
> XGene implements CVAL (a 64bit comparator) in terms of TVAL (a countdown
> register) instead of the other way around. TVAL being a 32bit register,
> the width of the counter should equally be 32. However, TVAL is a
> *signed* value, and keeps counting down in the negative range once the
> timer fires.
>
> It means that any TVAL value with bit 31 set will fire immediately,
> as it cannot be distinguished from an already expired timer. Reducing
> the timer range back to a paltry 31 bits papers over the issue.
>
> Another problem cannot be fixed though, which is that the timer interrupt
> *must* be handled within the negative countdown period, or the interrupt
> will be lost (TVAL will rollover to a positive value, indicative of a
> new timer deadline).
>
> Cc: stable at vger.kernel.org # 5.16+
> Fixes: 012f18850452 ("clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations")
> Signed-off-by: Joe Korty <joe.korty at concurrent-rt.com>
> Reviewed-by: Marc Zyngier <maz at kernel.org>
> [maz: revamped the commit message]
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> Link: https://lore.kernel.org/r/20221024165422.GA51107@zipoli.concurrent-rt.com
> ---
Applied
More information about the linux-arm-kernel
mailing list