[PATCH 15/16] KVM: arm64: selftests: Augment existing timer test to handle variable offsets

Colton Lewis coltonlewis at google.com
Fri Mar 10 11:26:47 PST 2023


Marc Zyngier <maz at kernel.org> writes:

>> mvbbq9:/data/coltonlewis/ecv/arm64-obj/kselftest/kvm#
>> ./aarch64/arch_timer -O 0xffff
>> ==== Test Assertion Failure ====
>>    aarch64/arch_timer.c:239: false
>>    pid=48094 tid=48095 errno=4 - Interrupted system call
>>       1  0x4010fb: test_vcpu_run at arch_timer.c:239
>>       2  0x42a5bf: start_thread at pthread_create.o:0
>>       3  0x46845b: thread_start at clone.o:0
>>    Failed guest assert: xcnt >= cval at aarch64/arch_timer.c:151
>> values: 2500645901305, 2500645961845; 9939, vcpu 0; stage; 3; iter: 2

> The fun part is that you can see similar things without the series:

> ==== Test Assertion Failure ====
>    aarch64/arch_timer.c:239: false
>    pid=647 tid=651 errno=4 - Interrupted system call
>       1  0x00000000004026db: test_vcpu_run at arch_timer.c:239
>       2  0x00007fffb13cedd7: ?? ??:0
>       3  0x00007fffb1437e9b: ?? ??:0
>    Failed guest assert: config_iter + 1 == irq_iter at aarch64/arch_timer.c:188
> values: 2, 3; 0, vcpu 3; stage; 4; iter: 3

> That's on a vanilla kernel (6.2-rc4) on an M1 with the test run
> without any argument in a loop. After a few iterations, it blows.

These are different failures. The first I've only ever found when
setting the -O option. What exact command did you use to trigger the
second, and were any non-default options involved?

Another interesting finding is that I can't reproduce any problems using
ARM's emulated platform. There is a possibility these errors are
ultimately down to individual hardware quirks, but that's still worth
understanding since everyone uses hardware and not emulators.

> The problem is that I don't understand enough of the test to make a
> judgement call. I hardly get *what* it is testing. Do you?

My understanding is that the test validates that timer interrupts occur
when the ARM manual says they should. It sets a comparison value (cval)
a few milliseconds into the future and waits for the counter (xcnt) to
be greater than or equal to the comparison value, at which point an
interrupt should fire.
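
To make that description concrete, here is a minimal sketch in plain C
of the check as I understand it. The helper names (msec_to_ticks,
program_cval, timer_condition_met) are made up for illustration and are
not the selftest's actual helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert milliseconds to counter ticks at a given counter frequency. */
static uint64_t msec_to_ticks(uint64_t msec, uint64_t freq_hz)
{
	assert(freq_hz > 0);
	return msec * freq_hz / 1000;
}

/* Hypothetical helper: the deadline is "now" plus a delay in ticks. */
static uint64_t program_cval(uint64_t now, uint64_t delay_ms, uint64_t freq_hz)
{
	return now + msec_to_ticks(delay_ms, freq_hz);
}

/* The predicate the guest asserts once the interrupt has been taken. */
static bool timer_condition_met(uint64_t xcnt, uint64_t cval)
{
	return xcnt >= cval;
}
```

Fed the values from the failure I posted (xcnt = 2500645901305,
cval = 2500645961845), timer_condition_met() returns false: the counter
is 60540 ticks short of the comparison value.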

The failure I posted occurs at a line that says

GUEST_ASSERT_3(xcnt >= cval, xcnt, cval, xcnt_diff_us);

The counter was less than the comparison value, which implies the
interrupt fired early. Do we care? I don't know. I think it's weird that
this occurs when I set a physical offset with -O and no other time.

I've also noticed that the greater the offset I set, the greater the
difference between xcnt and cval. I think the physical offset is not
being accounted for in every place it should be. At the very least,
that indicates a change is required in the test.
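
As an illustration of why a missed offset would scale this way, here is
a toy model in C. It is a guess at the shape of the problem, not a
diagnosis. It assumes the guest sees the host counter minus the offset,
and that some comparison is (hypothetically) made against the raw host
counter instead. Both function names are invented for this sketch.

```c
#include <stdint.h>

/* Assumed model: guest-visible count is the host count minus an offset. */
static uint64_t guest_count(uint64_t host_count, uint64_t offset)
{
	return host_count - offset;
}

/*
 * Hypothetical bug: the deadline cval was chosen in the guest's view,
 * but the comparison uses the raw host count. The condition then looks
 * met when host_count == cval, and at that moment the guest reads
 * guest_count(cval, offset). Return how far short of cval that is.
 */
static uint64_t shortfall_if_offset_ignored(uint64_t cval, uint64_t offset)
{
	return cval - guest_count(cval, offset);
}
```

In this model the shortfall equals the offset, so it is zero with no
offset and grows linearly as the offset grows. Whether the real
discrepancy follows exactly this pattern still needs checking.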

The failure you posted occurs at a line that says

GUEST_ASSERT_2(config_iter + 1 == irq_iter,
		config_iter + 1, irq_iter);

I gather from context that the values were unequal because an expected
interrupt never fired or was not counted. Do we care? I don't know. I
think someone should.
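
For reference, a small C sketch of what that assertion checks. It
assumes one interrupt is expected per configured iteration, and it
simulates interrupt delivery with a plain array. The names
first_bad_iter and irqs_per_iter are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* The predicate from the assertion: one counted IRQ per configuration. */
static bool iters_in_lockstep(uint32_t config_iter, uint32_t irq_iter)
{
	return config_iter + 1 == irq_iter;
}

/*
 * Simulate n iterations where irqs_per_iter[i] interrupts are counted
 * during iteration i. Return the first iteration at which the check
 * fails, or -1 if it never fails.
 */
static int first_bad_iter(const unsigned int *irqs_per_iter, uint32_t n)
{
	uint32_t irq_iter = 0;

	for (uint32_t config_iter = 0; config_iter < n; config_iter++) {
		irq_iter += irqs_per_iter[config_iter];
		if (!iters_in_lockstep(config_iter, irq_iter))
			return (int)config_iter;
	}
	return -1;
}
```

The check trips both when an interrupt is missing ({1, 0, 1}) and when
an extra one is counted ({1, 2, 1}). If the "values: 2, 3" in the report
are printed in argument order, the counted value was one above the
expected one, so the second case may be worth considering too.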


