arch_timer_edge_cases failures on ampere-one

Marc Zyngier maz at kernel.org
Thu Apr 10 08:35:27 PDT 2025


On Thu, 10 Apr 2025 16:10:43 +0100,
Sebastian Ott <sebott at redhat.com> wrote:
> 
> Hey,
> 
> I'm seeing consistent failures for the arch_timer_edge_cases
> selftest on ampere-one(x):
> ==== Test Assertion Failure ====
>   arm64/arch_timer_edge_cases.c:170: timer_condition == istatus
>   pid=6277 tid=6277 errno=4 - Interrupted system call
>      1  0x0000000000403bcf: test_run at arch_timer_edge_cases.c:962
>      2  0x0000000000401f1f: main at arch_timer_edge_cases.c:1083
>      3  0x0000ffffa8b2625b: ?? ??:0
>      4  0x0000ffffa8b2633b: ?? ??:0
>      5  0x000000000040202f: _start at ??:?
>   0x1 != 0x0 (timer_condition != istatus)
> 
> The (first) test that's failing is from test_timers_in_the_past():
>     /* Set a timer to counter=0 (in the past) */
>     test_timer_cval(timer, 0, wm, true, DEF_CNT);
> 
> If I understand this correctly then the timer condition is met, an
> irq should be raised with the istatus bit from SYS_CNTV_CTL_EL0 set.
> 
> What the guest gets for SYS_CNTV_CTL_EL0 is 1 (only the enable bit
> set). KVM also reads 1 in timer_save_state() via
> read_sysreg_el0(SYS_CNTV_CTL). Is this a HW/FW issue?

My hunch is that this is related to AC03_CPU_14 in [1] (now archived
locally for future reference...).
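
For reference, a rough sketch of what the assertion compares, assuming
the architectural CNTV_CTL_EL0 layout (ENABLE is bit 0, IMASK is bit 1,
ISTATUS is bit 2) and the unsigned CVAL comparison. The helper names
are made up for illustration; this is neither the selftest's nor KVM's
code:

```c
#include <stdbool.h>
#include <stdint.h>

/* CNTV_CTL_EL0 bit layout as defined by the Arm architecture. */
#define CTL_ENABLE	(UINT64_C(1) << 0)
#define CTL_IMASK	(UINT64_C(1) << 1)
#define CTL_ISTATUS	(UINT64_C(1) << 2)

/*
 * Hypothetical helper: what ISTATUS should read as for the CVAL view.
 * The condition is met once the timer is enabled and the (offset
 * adjusted) counter has reached the compare value.
 */
static bool expected_istatus(uint64_t ctl, uint64_t counter, uint64_t cval)
{
	return (ctl & CTL_ENABLE) && counter >= cval;
}

/* Hypothetical helper: what the control register actually reports. */
static bool reported_istatus(uint64_t ctl)
{
	return (ctl & CTL_ISTATUS) != 0;
}
```

With cval = 0 and ctl = 0x1 as in the report, expected_istatus() is
true for any counter value while reported_istatus() is false, which is
the 0x1 != 0x0 mismatch. A control value of 0x5 would have satisfied
the test.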

> 
> These machines have FEAT_ECV (as a test I disabled that in the kernel
> but with the same result).
> 
> As a hack I set ARCH_TIMER_CTRL_IT_STAT in timer_save_state() when
> the timer condition is met and set up traps for the register - this
> lets the testcase succeed.
> 
> All with the current upstream kernel - but this is not new, I saw
> this a couple of months ago but lost access to the machine before
> I could debug..
> 
> Any hints what to do here?

Not a lot to do, assuming this is the actual cause. Similar things
happen on my QC box, which has a "remarkable" timer implementation and
a bunch of terrible hacks to keep it alive.

On the other hand, I'm not even convinced that this test-case is
legit. It seems to rely on the counter being 64bit wide, which is not
always the case.
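
One way a narrower counter could matter, as an illustration only and
not something checked against the test: if the counter implements
fewer than 64 bits and reads back zero-extended, a compare value above
its maximum can never be reached by the unsigned comparison. The
helper names below are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest value an n-bit counter can report, zero-extended to 64 bits. */
static uint64_t counter_max(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* Can "counter >= cval" ever become true for a counter of this width? */
static bool cval_reachable(uint64_t cval, unsigned int bits)
{
	return cval <= counter_max(bits);
}
```

For a 56-bit counter, cval_reachable(1ULL << 60, 56) is false, while
the same compare value is reachable with a full 64-bit counter.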

	M.

[1] https://amperecomputing.com/assets/AmpereOne_Developer_ER_v0_80_20240823_28945022f4.pdf

-- 
Without deviation from the norm, progress is not possible.


