[PATCH v5 10/24] KVM: arm64: Set up FGT for Partitioned PMU

Colton Lewis coltonlewis at google.com
Fri Dec 12 12:51:34 PST 2025


Hi Oliver. Thanks for the speedy review.

Oliver Upton <oupton at kernel.org> writes:

> Hi Colton,

> On Tue, Dec 09, 2025 at 08:51:07PM +0000, Colton Lewis wrote:
>> In order to gain the best performance benefit from partitioning the
>> PMU, utilize fine grain traps (FEAT_FGT and FEAT_FGT2) to avoid
>> trapping common PMU register accesses by the guest to remove that
>> overhead.

>> Untrapped:
>> * PMCR_EL0
>> * PMUSERENR_EL0
>> * PMSELR_EL0
>> * PMCCNTR_EL0
>> * PMCNTEN_EL0
>> * PMINTEN_EL1
>> * PMEVCNTRn_EL0

>> These are safe to untrap because writing MDCR_EL2.HPMN as this series
>> will do limits the effect of writes to any of these registers to the
>> partition of counters 0..HPMN-1. Reads from these registers will not
>> leak information from between guests as all these registers are
>> context swapped by a later patch in this series. Reads from these
>> registers also do not leak any information about the host's hardware
>> beyond what is promised by PMUv3.

>> Trapped:
>> * PMOVS_EL0
>> * PMEVTYPERn_EL0
>> * PMCCFILTR_EL0
>> * PMICNTR_EL0
>> * PMICFILTR_EL0
>> * PMCEIDn_EL0
>> * PMMIR_EL1

>> PMOVS remains trapped so KVM can track overflow IRQs that will need to
>> be injected into the guest.

>> PMICNTR and PMIFILTR remain trapped because KVM is not handling them
>> yet.

>> PMEVTYPERn remains trapped so KVM can limit which events guests can
>> count, such as disallowing counting at EL2. PMCCFILTR and PMCIFILTR
>> are special cases of the same.

>> PMCEIDn and PMMIR remain trapped because they can leak information
>> specific to the host hardware implementation.

>> NOTE: This patch temporarily forces kvm_vcpu_pmu_is_partitioned() to
>> be false to prevent partial feature activation for easier debugging.

>> Signed-off-by: Colton Lewis <coltonlewis at google.com>
>> ---
>>   arch/arm64/include/asm/kvm_pmu.h | 33 ++++++++++++++++++++++
>>   arch/arm64/kvm/config.c          | 34 ++++++++++++++++++++--
>>   arch/arm64/kvm/pmu-direct.c      | 48 ++++++++++++++++++++++++++++++++
>>   3 files changed, 112 insertions(+), 3 deletions(-)

>> diff --git a/arch/arm64/include/asm/kvm_pmu.h  
>> b/arch/arm64/include/asm/kvm_pmu.h
>> index 8887f39c25e60..7297a697a4a62 100644
>> --- a/arch/arm64/include/asm/kvm_pmu.h
>> +++ b/arch/arm64/include/asm/kvm_pmu.h
>> @@ -96,6 +96,23 @@ u64 kvm_pmu_guest_counter_mask(struct arm_pmu *pmu);
>>   void kvm_pmu_host_counters_enable(void);
>>   void kvm_pmu_host_counters_disable(void);

>> +#if !defined(__KVM_NVHE_HYPERVISOR__)
>> +bool kvm_vcpu_pmu_is_partitioned(struct kvm_vcpu *vcpu);
>> +bool kvm_vcpu_pmu_use_fgt(struct kvm_vcpu *vcpu);
>> +#else
>> +static inline bool kvm_vcpu_pmu_is_partitioned(struct kvm_vcpu *vcpu)
>> +{
>> +	return false;
>> +}
>> +
>> +static inline bool kvm_vcpu_pmu_use_fgt(struct kvm_vcpu *vcpu)
>> +{
>> +	return false;
>> +}
>> +#endif
>> +u64 kvm_pmu_fgt_bits(void);
>> +u64 kvm_pmu_fgt2_bits(void);
>> +
>>   /*
>>    * Updates the vcpu's view of the pmu events for this cpu.
>>    * Must be called before every vcpu run after disabling interrupts, to  
>> ensure
>> @@ -135,6 +152,22 @@ static inline u64 kvm_pmu_get_counter_value(struct  
>> kvm_vcpu *vcpu,
>>   {
>>   	return 0;
>>   }
>> +static inline bool kvm_vcpu_pmu_is_partitioned(struct kvm_vcpu *vcpu)
>> +{
>> +	return false;
>> +}
>> +static inline bool kvm_vcpu_pmu_use_fgt(struct kvm_vcpu *vcpu)
>> +{
>> +	return false;
>> +}
>> +static inline u64 kvm_pmu_fgt_bits(void)
>> +{
>> +	return 0;
>> +}
>> +static inline u64 kvm_pmu_fgt2_bits(void)
>> +{
>> +	return 0;
>> +}
>>   static inline void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu,
>>   					     u64 select_idx, u64 val) {}
>>   static inline void kvm_pmu_set_counter_value_user(struct kvm_vcpu *vcpu,
>> diff --git a/arch/arm64/kvm/config.c b/arch/arm64/kvm/config.c
>> index 24bb3f36e9d59..064dc6aa06f76 100644
>> --- a/arch/arm64/kvm/config.c
>> +++ b/arch/arm64/kvm/config.c
>> @@ -6,6 +6,7 @@

>>   #include <linux/kvm_host.h>
>>   #include <asm/kvm_emulate.h>
>> +#include <asm/kvm_pmu.h>
>>   #include <asm/kvm_nested.h>
>>   #include <asm/sysreg.h>

>> @@ -1489,12 +1490,39 @@ static void __compute_hfgwtr(struct kvm_vcpu  
>> *vcpu)
>>   		*vcpu_fgt(vcpu, HFGWTR_EL2) |= HFGWTR_EL2_TCR_EL1;
>>   }

>> +static void __compute_hdfgrtr(struct kvm_vcpu *vcpu)
>> +{
>> +	__compute_fgt(vcpu, HDFGRTR_EL2);
>> +
>> +	if (kvm_vcpu_pmu_use_fgt(vcpu))
>> +		*vcpu_fgt(vcpu, HDFGRTR_EL2) |= kvm_pmu_fgt_bits();

> Couple of suggestions. I'd rather see this conditioned on
> kvm_vcpu_pmu_is_partitioned() and get rid of the FGT predicate. After
> all, kvm_vcpu_load_fgt() already checks for the presence of FEAT_FGT
> first.

I can use that here. I'll still need the use_fgt predicate when
programming MDCR_EL2 so I'll introduce in the later patch.

> Additionally, I'd prefer that the trap configuration is inline instead
> of done in a helper in some other file. Centralizing the FGT
> configuration here was very much intentional.

Sure I can centralize it.

> The other reason for doing this is kvm_pmu_fgt_bits() assumes a
> 'positive' trap polarity, even though there are several cases where FGTs
> have a 'negative' priority (i.e. 0 => trap).

For the bits I was concerned with they all had positive polarity, except
for the dedicated instruction counter. (Side note: Why would ARM do
this?)

IIRC the FGT setup I plugged into in previous versions of the patch had
some icky macros that accounted for polarity. They were confusing and I
didn't like the effort to understand them.

Is there a good reason not to adopt a convetion that 1 => trap for
kernel code? Reversing the negative polarities immediately before write
could be easy: Have a bitmap of the negative polarity bits to xor with
the traps we actually want.

A^0 = A (positive polarity bits)
A^1 = ~A (negative polarity bits)

> Thanks,
> Oliver



More information about the linux-arm-kernel mailing list