[PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation

Leo Yan leo.yan at linaro.org
Tue Aug 22 06:45:16 PDT 2023


Hi,

On Sun, Aug 20, 2023 at 10:01:08AM +0100, Marc Zyngier wrote:
> Huang Shijie reports that, when profiling a guest from the host
> with a number of events that exceeds the number of available
> counters, the reported counts are wildly inaccurate. Without
> the counter oversubscription, the reported counts are correct.
> 
> Their investigation indicates that upon counter rotation (which
> takes place on the back of a timer interrupt), we fail to
> re-apply the guest EL0 enabling, leading to the counting of host
> events instead of guest events.
> 
> In order to solve this, add yet another hook between the host PMU
> driver and KVM, re-applying the guest EL0 configuration if the
> right conditions apply (the host is VHE, we are in interrupt
> context, and we interrupted a running vcpu). This triggers a new
> vcpu request which will apply the correct configuration on guest
> reentry.
> 
> With this, we have the correct counts, even when the counters are
> oversubscribed.

I tested this patch, and it works well.

However, I see that this patch can cause a huge number of calls to
kvm_vcpu_pmu_restore_guest() when using the 'perf record' command.

As I mentioned in patch v2, we can call kvm_vcpu_pmu_resync_el0()
in the function kvm_set_pmu_events() rather than in armv8pmu_start().
With this change, the kernel only syncs the PMU context when the host
and the guest have different tracing for EL0.  I paste the suggested
change below for reference:

@@ -46,6 +48,8 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
                pmu->events_host |= set;
        if (!attr->exclude_guest)
                pmu->events_guest |= set;
+
+       kvm_vcpu_pmu_resync_el0();
 }

Below is a comparison of the resync counts, i.e. how many times
kvm_vcpu_pmu_restore_guest() is called for the 'perf stat' and
'perf record' commands.

                  |  perf stat(*)    |  perf record(**)
 -----------------+------------------+-----------------
 Patch v2:        |  2506            |  47325
 Proposed change: |  2514            |  2504

(*): sudo ./perf stat -a -e cycles:G,cycles:H -d -d -d sleep 10
(**): sudo ./perf record -a -e cycles:G,cycles:H -d -d -d sleep 10
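
To illustrate why the call site matters, here is a rough user-space
sketch, not kernel code. Every name in it (model_cpu, model_vcpu,
model_resync_el0, model_run, enum resync_site) is hypothetical. It
assumes that a counter rotation both updates the event set and restarts
the counters, while a sampling overflow only restarts the counters; that
is my reading of the driver, not something measured here. It also counts
resync requests rather than kvm_vcpu_pmu_restore_guest() calls, so the
numbers it produces are illustrative only and are unrelated to the table
above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; this is not the KVM API. */
struct model_vcpu {
	unsigned long resync_requests;	/* resync requests raised so far */
};

struct model_cpu {
	bool host_is_vhe;
	bool in_interrupt;
	struct model_vcpu *running_vcpu; /* NULL if no vcpu was interrupted */
};

/* Where the resync hook is placed (both options are modelled). */
enum resync_site { RESYNC_IN_PMU_START, RESYNC_IN_SET_EVENTS };

/*
 * Mirrors the three conditions from the commit message: the host is
 * VHE, we are in interrupt context, and we interrupted a running vcpu.
 * Returns true if a resync request was raised.
 */
static bool model_resync_el0(struct model_cpu *cpu)
{
	if (!cpu->host_is_vhe || !cpu->in_interrupt)
		return false;
	if (cpu->running_vcpu == NULL)
		return false;
	cpu->running_vcpu->resync_requests++;
	return true;
}

/*
 * Assumed event flow: each rotation updates the event set and then
 * restarts the counters; each sampling overflow only restarts them.
 */
static unsigned long model_run(enum resync_site site,
			       unsigned int rotations,
			       unsigned int overflows)
{
	struct model_vcpu vcpu = { 0 };
	struct model_cpu cpu = {
		.host_is_vhe = true,
		.in_interrupt = true,
		.running_vcpu = &vcpu,
	};
	unsigned int i;

	for (i = 0; i < rotations; i++) {
		if (site == RESYNC_IN_SET_EVENTS)
			model_resync_el0(&cpu);	/* event set updated */
		if (site == RESYNC_IN_PMU_START)
			model_resync_el0(&cpu);	/* counters restarted */
	}
	for (i = 0; i < overflows; i++) {
		if (site == RESYNC_IN_PMU_START)
			model_resync_el0(&cpu);	/* restart after overflow */
	}
	return vcpu.resync_requests;
}
```

Under these assumptions, model_run(RESYNC_IN_PMU_START, 10, 100) raises
one request per rotation plus one per overflow, while
model_run(RESYNC_IN_SET_EVENTS, 10, 100) raises one per rotation only,
which is the shape of the difference between the two columns for
'perf record'.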

Thanks,
Leo


