[PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
Leo Yan
leo.yan at linaro.org
Tue Aug 22 06:45:16 PDT 2023
Hi,
On Sun, Aug 20, 2023 at 10:01:08AM +0100, Marc Zyngier wrote:
> Huang Shijie reports that, when profiling a guest from the host
> with a number of events that exceeds the number of available
> counters, the reported counts are wildly inaccurate. Without
> the counter oversubscription, the reported counts are correct.
>
> Their investigation indicates that upon counter rotation (which
> takes place on the back of a timer interrupt), we fail to
> re-apply the guest EL0 enabling, leading to the counting of host
> events instead of guest events.
>
> In order to solve this, add yet another hook between the host PMU
> driver and KVM, re-applying the guest EL0 configuration if the
> right conditions apply (the host is VHE, we are in interrupt
> context, and we interrupted a running vcpu). This triggers a new
> vcpu request which will apply the correct configuration on guest
> reentry.
>
> With this, we have the correct counts, even when the counters are
> oversubscribed.
I gave a test for this patch, It works well.
However, I do see this patch can introduce huge amount invoking
kvm_vcpu_pmu_restore_guest() when using 'perf record' command.
As I mentioned in the patch v2, we can call kvm_vcpu_pmu_resync_el0()
in the function kvm_set_pmu_events() rather than in armv8pmu_start().
With this change, the kernel only syncs PMU context when the host and
the guest have different traceing for EL0. Just paste the suggested
code for reference:
@@ -46,6 +48,8 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
pmu->events_host |= set;
if (!attr->exclude_guest)
pmu->events_guest |= set;
+
+ kvm_vcpu_pmu_resync_el0();
}
Below is the comparison result for counting resync, the result is for
counting how many times kvm_vcpu_pmu_restore_guest() is called for
'perf stat' and 'perf record' commands.
| perf stat(*) | perf record(**)
-----------------+------------------+-----------------
Patch v3: | 2506 | 47325
Proposed change: | 2514 | 2504
(*): sudo ./perf stat -a -e cycles:G,cycles:H -d -d -d sleep 10
(**): sudo ./perf record -a -e cycles:G,cycles:H -d -d -d sleep 10
Thanks,
Leo
More information about the linux-arm-kernel
mailing list