[PATCH v2 0/3] KVM: arm64: selftests: arch_timer_edge_cases fixes

Sebastian Ott sebott at redhat.com
Wed Jun 4 14:17:15 PDT 2025


On Wed, 4 Jun 2025, Sebastian Ott wrote:
> On Tue, 3 Jun 2025, Zenghui Yu wrote:
>>  On 2025/5/27 22:24, Sebastian Ott wrote:
>>>  Some small fixes for arch_timer_edge_cases that I stumbled upon
>>>  while debugging failures for this selftest on ampere-one.
>>>
>>>  Changes since v1: modified patch 3 based on suggestions from Marc.
>>>
>>>  I've done some tests with this on various machines - seems to be all
>>>  good, however on ampere-one I now hit this in 10% of the runs:
>>>  ==== Test Assertion Failure ====
>>>    arm64/arch_timer_edge_cases.c:481: timer_get_cntct(timer) >= DEF_CNT +
>>>    (timer_get_cntfrq() * (uint64_t)(delta_2_ms) / 1000)
>>>    pid=166657 tid=166657 errno=4 - Interrupted system call
>>>       1  0x0000000000404db3: test_run at arch_timer_edge_cases.c:933
>>>       2  0x0000000000401f9f: main at arch_timer_edge_cases.c:1062
>>>       3  0x0000ffffaedd625b: ?? ??:0
>>>       4  0x0000ffffaedd633b: ?? ??:0
>>>       5  0x00000000004020af: _start at ??:?
>>>    timer_get_cntct(timer) >= DEF_CNT + msec_to_cycles(delta_2_ms)
>>>
>>>  This is not new, it was just hidden behind the other failure. I'll
>>>  try to figure out what this is about (seems to be independent of
>>>  the wait time)..
>>
>>  Not sure if you have figured it out. I can easily reproduce it on my box
>>  and I *guess* it is that we have some random XVAL values when we enable
>>  the timer..
>
> Yes, I think so, too.
>
>>  test_reprogramming_timer()
>>  {
>>   local_irq_disable();
>>   reset_timer_state(timer, DEF_CNT);
>
> My first attempt was to also initialize cval here

Forgot to mention that I did this because my tests have shown
that the interrupt didn't only trigger early (like before the
reprogrammed delta) but instantly. This seemed to work but I think
the order in set_tval_irq() is the actual issue.


>
>>
>>   /* Program the timer to DEF_CNT + delta_1_ms. */
>>   set_tval_irq(timer, msec_to_cycles(delta_1_ms), CTL_ENABLE);
>>
>>  	[...]
>>  }
>>
>>  set_tval_irq()
>>  {
>>   timer_set_ctl(timer, ctl);
>>
>>   // There is a window that we enable the timer with *random* XVAL
>>   // values and we may get the unexpected interrupt.. And it's
>>   // unlikely that KVM can be aware of TVAL's change (and
>>   // re-evaluate the interrupt's pending state) before hitting the
>>   // GUEST_ASSERT().
>>
>>   timer_set_tval(timer, tval_cycles);
>
> Yes, I stumbled over this as well. I've always assumed that this order is
> becauase of this from the architecture "If CNTV_CTL_EL0.ENABLE is 0, the 
> value returned is UNKNOWN." However re-reading that part today I realized
> that this only concerns register reads.
>
> Maybe somone on cc knows why it's in that order?
>
> I'm currently testing this with the above swapped and it's looking good,
> so far.
>
>>  }
>>
>>  I'm not familiar with the test so I'm not 100% sure that this is the
>>  root cause. But I hope this helps with your analysis ;-) .
>
> It did, thanks!
>
> Sebastian
>




More information about the linux-arm-kernel mailing list