[PATCH 0/9] KVM: arm64: PMU: Fixing chained events, and PMUv3p5 support

Marc Zyngier maz at kernel.org
Mon Oct 24 11:05:12 PDT 2022


Hi Ricardo,

On Fri, 12 Aug 2022 23:53:44 +0100,
Ricardo Koller <ricarkol at google.com> wrote:
> 
> On Thu, Aug 11, 2022 at 01:56:21PM +0100, Marc Zyngier wrote:
> > On Wed, 10 Aug 2022 22:55:03 +0100,
> > Ricardo Koller <ricarkol at google.com> wrote:
> > > 
> > > Just realized that KVM does not offer PMUv3p5 (with this series applied)
> > > when the real hardware is only Armv8.2 (the setup I originally tried).
> > > So, tried these other two setups on the fast model:
> > > 
> > > has_arm_v8-5=1
> > > 
> > > 	# ./lkvm-static run --nodefaults --pmu pmu.flat -p pmu-chained-sw-incr
> > > 	# lkvm run -k pmu.flat -m 704 -c 8 --name guest-135
> > > 
> > > 	INFO: PMU version: 0x6
> > >                            ^^^
> > >                            PMUv3 for Armv8.5
> > > 	INFO: PMU implementer/ID code: 0x41("A")/0
> > > 	INFO: Implements 8 event counters
> > > 	FAIL: pmu: pmu-chained-sw-incr: overflow and chain counter incremented after 100 SW_INCR/CHAIN
> > > 	INFO: pmu: pmu-chained-sw-incr: overflow=0x0, #0=4294967380 #1=0
> > >                                                  ^^^
> > >                                                  no overflows
> > > 	FAIL: pmu: pmu-chained-sw-incr: expected overflows and values after 100 SW_INCR/CHAIN
> > > 	INFO: pmu: pmu-chained-sw-incr: overflow=0x0, #0=84 #1=-1
> > > 	INFO: pmu: pmu-chained-sw-incr: overflow=0x0, #0=4294967380 #1=4294967295
> > > 	SUMMARY: 2 tests, 2 unexpected failures
> > 
> > Hmm. I think I see what's wrong. In kvm_pmu_create_perf_event(), we
> > have this:
> > 
> > 	if (kvm_pmu_idx_is_64bit(vcpu, select_idx))
> > 		attr.config1 |= 1;
> > 
> > 	counter = kvm_pmu_get_counter_value(vcpu, select_idx);
> > 
> > 	/* The initial sample period (overflow count) of an event. */
> > 	if (kvm_pmu_idx_has_64bit_overflow(vcpu, select_idx))
> > 		attr.sample_period = (-counter) & GENMASK(63, 0);
> > 	else
> > 		attr.sample_period = (-counter) & GENMASK(31, 0);
> > 
> > but the initial sampling period shouldn't be based on the *guest*
> > counter overflow. It really is about getting to an overflow on the
> > *host*, so the initial code was correct, and only the width of the
> > counter matters here.
> 
> Right, I think this requires bringing back some of the chain-related
> code (like update_pmc_chained() and pmc_is_chained()), because
> 
> 	attr.sample_period = (-counter) & GENMASK(31, 0);
> 
> should also be used when the counter is chained.

Almost, but not quite. I came up with the following hack (not
everything is relevant, but you'll get my drift):

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 9f29212e8fcd..6470a42e981d 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -450,6 +450,9 @@ static void kvm_pmu_counter_increment(struct kvm_vcpu *vcpu,
 			reg = lower_32_bits(reg);
 		__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
 
+		if (!kvm_pmu_idx_has_64bit_overflow(vcpu, i))
+			reg = lower_32_bits(reg);
+
 		if (reg) /* No overflow? move on */
 			continue;
 
@@ -483,7 +486,7 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
 	 */
 	period = -(local64_read(&perf_event->count));
 
-	if (!kvm_pmu_idx_has_64bit_overflow(vcpu, pmc->idx))
+	if (!kvm_pmu_idx_is_64bit(vcpu, pmc->idx))
 		period &= GENMASK(31, 0);
 
 	local64_set(&perf_event->hw.period_left, 0);
@@ -605,17 +608,24 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
 	attr.exclude_host = 1; /* Don't count host events */
 	attr.config = eventsel;
 
-	/* If counting a 64bit event, advertise it to the perf code */
-	if (kvm_pmu_idx_is_64bit(vcpu, select_idx))
-		attr.config1 |= 1;
-
 	counter = kvm_pmu_get_counter_value(vcpu, select_idx);
 
-	/* The initial sample period (overflow count) of an event. */
-	if (kvm_pmu_idx_has_64bit_overflow(vcpu, select_idx))
-		attr.sample_period = (-counter) & GENMASK(63, 0);
-	else
+	/*
+	 * If counting with a 64bit counter, advertise it to the perf
+	 * code, carefully dealing with the initial sample period
+	 * which also depends on the overflow.
+	 */
+	if (kvm_pmu_idx_is_64bit(vcpu, select_idx)) {
+		attr.config1 |= 1;
+
+		if (!kvm_pmu_idx_has_64bit_overflow(vcpu, select_idx)) {
+			attr.sample_period = -(counter & GENMASK(31, 0));
+		} else {
+			attr.sample_period = (-counter) & GENMASK(63, 0);
+		}
+	} else {
 		attr.sample_period = (-counter) & GENMASK(31, 0);
+	}
 
 	event = perf_event_create_kernel_counter(&attr, -1, current,
 						 kvm_pmu_perf_overflow, pmc);


With this, I'm back in business (in QEMU, as I *still* cannot get ARM
to give me a model that runs natively on arm64...):

root at debian:~/kvm-unit-tests# ../kvmtool/lkvm run --nodefaults --pmu --firmware arm/pmu.flat -p pmu-chained-sw-incr
  # lkvm run --firmware arm/pmu.flat -m 448 -c 4 --name guest-400
  Info: Removed ghost socket file "/root/.lkvm//guest-400.sock".
WARNING: early print support may not work. Found uart at 0x1000000, but early base is 0x9000000.
INFO: PMU version: 0x6
INFO: PMU implementer/ID code: 0x41("A")/0x1
INFO: Implements 6 event counters
FAIL: pmu: pmu-chained-sw-incr: no overflow and chain counter incremented after 100 SW_INCR/CHAIN
INFO: pmu: pmu-chained-sw-incr: overflow=0x1, #0=4294967380 #1=1
FAIL: pmu: pmu-chained-sw-incr: overflow on chain counter and expected values after 100 SW_INCR/CHAIN
INFO: pmu: pmu-chained-sw-incr: overflow=0x3, #0=4294967380 #1=4294967296
SUMMARY: 2 tests, 2 unexpected failures

The tests themselves need some extra love to account for the fact that
the counters are always 64bit irrespective of the overflow, but at
least I'm now correctly seeing the odd counter incrementing.

I'll try to continue addressing the comments tomorrow.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


