[PATCH 0/9] KVM: arm64: PMU: Fixing chained events, and PMUv3p5 support
Marc Zyngier
maz at kernel.org
Thu Aug 11 05:56:21 PDT 2022
On Wed, 10 Aug 2022 22:55:03 +0100,
Ricardo Koller <ricarkol at google.com> wrote:
>
> On Wed, Aug 10, 2022 at 02:33:53PM -0500, Oliver Upton wrote:
> > Hi Ricardo,
> >
> > On Wed, Aug 10, 2022 at 11:46:22AM -0700, Ricardo Koller wrote:
> > > On Fri, Aug 05, 2022 at 02:58:04PM +0100, Marc Zyngier wrote:
> > > > Ricardo recently reported[1] that our PMU emulation was busted when it
> > > > comes to chained events, as we cannot expose the overflow on a 32bit
> > > > boundary (which the architecture requires).
> > > >
> > > > This series aims at fixing this (by deleting a lot of code), and as a
> > > > bonus adds support for PMUv3p5, as this requires us to fix a few more
> > > > things.
> > > >
> > > > Tested on A53 (PMUv3) and FVP (PMUv3p5).
> > > >
> > > > [1] https://lore.kernel.org/r/20220805004139.990531-1-ricarkol@google.com
> > > >
> > > > Marc Zyngier (9):
> > > > KVM: arm64: PMU: Align chained counter implementation with
> > > > architecture pseudocode
> > > > KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow
> > > > KVM: arm64: PMU: Only narrow counters that are not 64bit wide
> > > > KVM: arm64: PMU: Add counter_index_to_*reg() helpers
> > > > KVM: arm64: PMU: Simplify setting a counter to a specific value
> > > > KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation
> > > > KVM: arm64: PMU: Allow ID_AA64DFR0_EL1.PMUver to be set from userspace
> > > > KVM: arm64: PMU: Implement PMUv3p5 long counter support
> > > > KVM: arm64: PMU: Allow PMUv3p5 to be exposed to the guest
> > > >
> > > >  arch/arm64/include/asm/kvm_host.h |   1 +
> > > >  arch/arm64/kvm/arm.c              |   6 +
> > > >  arch/arm64/kvm/pmu-emul.c         | 372 ++++++++++--------------------
> > > >  arch/arm64/kvm/sys_regs.c         |  65 +++++-
> > > >  include/kvm/arm_pmu.h             |  16 +-
> > > >  5 files changed, 208 insertions(+), 252 deletions(-)
> > > >
> > > > --
> > > > 2.34.1
> > > >
> > >
> > > Hi Marc,
> > >
> > > There is one extra potential issue with exposing PMUv3p5. I see this
> > > weird behavior when doing passthrough ("bare metal") on the fast-model
> > > configured to emulate PMUv3p5: the [63:32] half of the counters
> > > overflowing at 32-bits is still incremented.
> > >
> > > Fast model - ARMv8.5:
> > >
> > > Assuming the initial state is even=0xFFFFFFFF and odd=0x0,
> > > incrementing the even counter leads to:
> > >
> > >   0x00000001_00000000    0x00000000_00000001    0x1
> > >   even counter           odd counter            PMOVSET
> > >
> > > Assuming the initial state is even=0xFFFFFFFF and odd=0xFFFFFFFF,
> > > incrementing the even counter leads to:
> > >
> > >   0x00000001_00000000    0x00000001_00000000    0x3
> > >   even counter           odd counter            PMOVSET
> >
> > This is to be expected, actually. PMUv3p5 counters are always 64 bit,
> > regardless of the configured overflow.
> >
> > DDI 0487H D8.3 Behavior on overflow
> >
> > If FEAT_PMUv3p5 is implemented, 64-bit event counters are implemented,
> > HDCR.HPMN is not 0, and either n is in the range [0 .. (HDCR.HPMN-1)]
> > or EL2 is not implemented, then event counter overflow is configured
> > by PMCR.LP:
> >
> > — When PMCR.LP is set to 0, if incrementing PMEVCNTR<n> causes an unsigned
> > overflow of bits [31:0] of the event counter, the PE sets PMOVSCLR[n] to 1.
> > — When PMCR.LP is set to 1, if incrementing PMEVCNTR<n> causes an unsigned
> > overflow of bits [63:0] of the event counter, the PE sets PMOVSCLR[n] to 1.
> >
> > [...]
> >
> > For all 64-bit counters, incrementing the counter is the same whether an
> > unsigned overflow occurs at [31:0] or [63:0]. If the counter increments
> > for an event, bits [63:0] are always incremented.
> >
> > Do you see this same (expected) failure w/ Marc's series?
>
> I don't know, I'm hitting another bug it seems.
>
> Just realized that KVM does not offer PMUv3p5 (with this series applied)
> when the real hardware is only Armv8.2 (the setup I originally tried).
> So, tried these other two setups on the fast model:
>
> has_arm_v8-5=1
>
> # ./lkvm-static run --nodefaults --pmu pmu.flat -p pmu-chained-sw-incr
> # lkvm run -k pmu.flat -m 704 -c 8 --name guest-135
>
> INFO: PMU version: 0x6
>                    ^^^
>                    PMUv3 for Armv8.5
> INFO: PMU implementer/ID code: 0x41("A")/0
> INFO: Implements 8 event counters
> FAIL: pmu: pmu-chained-sw-incr: overflow and chain counter incremented after 100 SW_INCR/CHAIN
> INFO: pmu: pmu-chained-sw-incr: overflow=0x0, #0=4294967380 #1=0
>                                          ^^^
>                                          no overflows
> FAIL: pmu: pmu-chained-sw-incr: expected overflows and values after 100 SW_INCR/CHAIN
> INFO: pmu: pmu-chained-sw-incr: overflow=0x0, #0=84 #1=-1
> INFO: pmu: pmu-chained-sw-incr: overflow=0x0, #0=4294967380 #1=4294967295
> SUMMARY: 2 tests, 2 unexpected failures

Hmm. I think I see what's wrong. In kvm_pmu_create_perf_event(), we
have this:

	if (kvm_pmu_idx_is_64bit(vcpu, select_idx))
		attr.config1 |= 1;

	counter = kvm_pmu_get_counter_value(vcpu, select_idx);

	/* The initial sample period (overflow count) of an event. */
	if (kvm_pmu_idx_has_64bit_overflow(vcpu, select_idx))
		attr.sample_period = (-counter) & GENMASK(63, 0);
	else
		attr.sample_period = (-counter) & GENMASK(31, 0);

but the initial sampling period shouldn't be based on the *guest*
counter overflow. It really is about getting to an overflow on the
*host*, so the initial code was correct, and only the width of the
counter matters here.
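
To make the arithmetic concrete, here is a minimal standalone sketch of
a sample period that depends only on the counter width, as argued
above. It is not the kernel code: width_mask() and sample_period() are
hypothetical names, and the boolean argument stands in for whatever
kvm_pmu_idx_is_64bit() would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask covering the low 'bits' bits (1..64). */
static uint64_t width_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * Number of increments left before a counter of the given width wraps.
 * Only the counter width is considered here; where the guest wants its
 * overflow flag set (bit 31 or bit 63) is deliberately not an input.
 */
static uint64_t sample_period(uint64_t counter, bool counter_is_64bit)
{
	/* Unsigned negation modulo 2^64, then truncation to the width. */
	return (UINT64_C(0) - counter) & width_mask(counter_is_64bit ? 64 : 32);
}
```

With a 32bit counter at 0xFFFFFFFF this gives a period of 1. With a
64bit counter at the same value it gives 0xFFFFFFFF00000001, however
the guest has configured its overflow.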

/me goes back to running the FVP...

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.