[PATCH v5 17/24] KVM: arm64: Context swap Partitioned PMU guest registers
Colton Lewis
coltonlewis at google.com
Fri Dec 12 13:57:29 PST 2025
Oliver Upton <oupton at kernel.org> writes:
> On Tue, Dec 09, 2025 at 08:51:14PM +0000, Colton Lewis wrote:
>> +/**
>> + * kvm_pmu_load() - Load untrapped PMU registers
>> + * @vcpu: Pointer to struct kvm_vcpu
>> + *
>> + * Load all untrapped PMU registers from the VCPU into the PCPU. Mask
>> + * to only bits belonging to guest-reserved counters and leave
>> + * host-reserved counters alone in bitmask registers.
>> + */
>> +void kvm_pmu_load(struct kvm_vcpu *vcpu)
>> +{
>> + struct arm_pmu *pmu;
>> + u64 mask;
>> + u8 i;
>> + u64 val;
>> +
> Assert that preemption is disabled.
Will do.
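Concretely, something like this at the top of both kvm_pmu_load() and
kvm_pmu_put():

	/* Touching pcpu registers is only sane if we can't migrate. */
	lockdep_assert_preemption_disabled();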
>> + /*
>> + * If we aren't using FGT then we are trapping everything
>> + * anyway, so no need to bother with the swap.
>> + */
>> + if (!kvm_vcpu_pmu_use_fgt(vcpu))
>> + return;
> Uhh... Then how do events count in this case?
> The absence of FEAT_FGT shouldn't affect the residence of the guest PMU
> context. We just need to handle the extra traps, ideally by reading the
> PMU registers directly as a fast path exit handler.
Agreed. Yeah, I fixed this in my internal backports, but it looks like
I skipped incorporating the fix here.
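For v6 the fix is to drop this early return from both kvm_pmu_load()
and kvm_pmu_put() so the guest context is always resident while the
vcpu is loaded:

	-	/*
	-	 * If we aren't using FGT then we are trapping everything
	-	 * anyway, so no need to bother with the swap.
	-	 */
	-	if (!kvm_vcpu_pmu_use_fgt(vcpu))
	-		return;

The extra traps taken without FEAT_FGT can then be satisfied by
reading and writing the hardware registers directly in the exit path,
as you suggest.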
>> + pmu = vcpu->kvm->arch.arm_pmu;
>> +
>> + for (i = 0; i < pmu->hpmn_max; i++) {
>> + val = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i);
>> + write_pmevcntrn(i, val);
>> + }
>> +
>> + val = __vcpu_sys_reg(vcpu, PMCCNTR_EL0);
>> + write_pmccntr(val);
>> +
>> + val = __vcpu_sys_reg(vcpu, PMUSERENR_EL0);
>> + write_pmuserenr(val);
> What about the host's value for PMUSERENR?
>> + val = __vcpu_sys_reg(vcpu, PMSELR_EL0);
>> + write_pmselr(val);
> PMSELR_EL0 needs to be switched late, e.g. at
> sysreg_restore_guest_state_vhe().
> Even though the host doesn't currently use the selector-based accessor,
> I'd prefer we not load things that'd affect the host context until we're
> about to enter the guest.
There's a spot in __activate_traps_common() where the host value for
PMUSERENR is saved and PMSELR is zeroed. I skipped that branch when
partitioning because it was clobbering my loaded values, but I can
modify it instead so those registers are handled the way you describe.
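Rough shape of what I mean, keeping the host save in
__activate_traps_common() and writing the guest values there, late
(kvm_vcpu_pmu_is_partitioned() is a stand-in for whatever predicate
this series ends up using):

	hctxt = host_data_ptr(host_ctxt);
	ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);

	if (kvm_vcpu_pmu_is_partitioned(vcpu)) {
		/* Guest values go on just before entry. */
		write_sysreg(__vcpu_sys_reg(vcpu, PMSELR_EL0), pmselr_el0);
		write_sysreg(__vcpu_sys_reg(vcpu, PMUSERENR_EL0),
			     pmuserenr_el0);
	} else {
		write_sysreg(0, pmselr_el0);
		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
	}
	vcpu_set_flag(vcpu, PMUSERENR_ON_CPU);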
>> + /* Write only the stateful writable bits. */
>> + val = __vcpu_sys_reg(vcpu, PMCR_EL0);
>> + mask = ARMV8_PMU_PMCR_MASK &
>> + ~(ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C);
>> + write_pmcr(val & mask);
>> +
>> + /*
>> + * When handling these:
>> + * 1. Apply only the bits for guest counters (indicated by mask)
>> + * 2. Use the different registers for set and clear
>> + */
>> + mask = kvm_pmu_guest_counter_mask(pmu);
>> +
>> + val = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
>> + write_pmcntenset(val & mask);
>> + write_pmcntenclr(~val & mask);
>> +
>> + val = __vcpu_sys_reg(vcpu, PMINTENSET_EL1);
>> + write_pmintenset(val & mask);
>> + write_pmintenclr(~val & mask);
> Is this safe? What happens if we put the PMU into an overflow condition?
It gets handled by the host the same as any other PMU interrupt.
Though I remember from our conversation that you don't want the
latency of an additional interrupt, so I can handle that here.
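Sketch of what handling it here could look like, right after the
PMINTEN writes above (kvm_pmu_synthesize_overflow() is a hypothetical
helper that raises the vPMU interrupt without waiting for a physical
one):

	/*
	 * If a guest counter already overflowed with its interrupt
	 * enabled, raise the vPMU interrupt now rather than taking a
	 * physical interrupt immediately after entry.
	 */
	val = __vcpu_sys_reg(vcpu, PMOVSSET_EL0);
	if (val & __vcpu_sys_reg(vcpu, PMINTENSET_EL1) & mask)
		kvm_pmu_synthesize_overflow(vcpu);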
>> +}
>> +
>> +/**
>> + * kvm_pmu_put() - Put untrapped PMU registers
>> + * @vcpu: Pointer to struct kvm_vcpu
>> + *
>> + * Put all untrapped PMU registers from the PCPU into the VCPU. Mask
>> + * to only bits belonging to guest-reserved counters and leave
>> + * host-reserved counters alone in bitmask registers.
>> + */
>> +void kvm_pmu_put(struct kvm_vcpu *vcpu)
>> +{
>> + struct arm_pmu *pmu;
>> + u64 mask;
>> + u8 i;
>> + u64 val;
>> +
>> + /*
>> + * If we aren't using FGT then we are trapping everything
>> + * anyway, so no need to bother with the swap.
>> + */
>> + if (!kvm_vcpu_pmu_use_fgt(vcpu))
>> + return;
>> +
>> + pmu = vcpu->kvm->arch.arm_pmu;
>> +
>> + for (i = 0; i < pmu->hpmn_max; i++) {
>> + val = read_pmevcntrn(i);
>> + __vcpu_assign_sys_reg(vcpu, PMEVCNTR0_EL0 + i, val);
>> + }
>> +
>> + val = read_pmccntr();
>> + __vcpu_assign_sys_reg(vcpu, PMCCNTR_EL0, val);
>> +
>> + val = read_pmuserenr();
>> + __vcpu_assign_sys_reg(vcpu, PMUSERENR_EL0, val);
>> +
>> + val = read_pmselr();
>> + __vcpu_assign_sys_reg(vcpu, PMSELR_EL0, val);
>> +
>> + val = read_pmcr();
>> + __vcpu_assign_sys_reg(vcpu, PMCR_EL0, val);
>> +
>> + /* Mask these to only save the guest relevant bits. */
>> + mask = kvm_pmu_guest_counter_mask(pmu);
>> +
>> + val = read_pmcntenset();
>> + __vcpu_assign_sys_reg(vcpu, PMCNTENSET_EL0, val & mask);
>> +
>> + val = read_pmintenset();
>> + __vcpu_assign_sys_reg(vcpu, PMINTENSET_EL1, val & mask);
> What if the PMU is in an overflow state at this point?
Is this a separate concern from the point above? It gets loaded back
that way, and the normal interrupt machinery handles it.
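To spell out "that way": the overflow flags travel with the rest of
the context, masked to guest counters like the enable bits (sketch;
write_pmovsset() is assumed here alongside the existing
write_pmovsclr() accessor):

	/* kvm_pmu_put(): save guest overflow status */
	val = read_pmovsclr();
	__vcpu_assign_sys_reg(vcpu, PMOVSSET_EL0, val & mask);

	/* kvm_pmu_load(): restore it, leaving host counters alone */
	val = __vcpu_sys_reg(vcpu, PMOVSSET_EL0);
	write_pmovsset(val & mask);
	write_pmovsclr(~val & mask);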