[PATCH] KVM: arm64: pmu: Resync EL0 state on counter rotation

Shijie Huang shijie at amperemail.onmicrosoft.com
Tue Aug 15 00:24:18 PDT 2023


Hi Marc,

在 2023/8/15 14:27, Marc Zyngier 写道:
> On Mon, 14 Aug 2023 03:58:32 +0100,
> Shijie Huang <shijie at amperemail.onmicrosoft.com> wrote:
>> Hi Marc,
>>
>> 在 2023/8/12 2:05, Marc Zyngier 写道:
>>> Huang Shijie reports that, when profiling a guest from the host
>>> with a number of events that exceeds the number of available
>>> counters, the reported counts are wildly inaccurate. Without
>>> the counter oversubscription, the reported counts are correct.
>>>
>>> Their investigation indicates that upon counter rotation (which
>>> takes place on the back of a timer interrupt), we fail to
>>> re-apply the guest EL0 enabling, leading to the counting of host
>>> events instead of guest events.
>>>
>>> In order to solve this, add yet another hook between the host PMU
>>> driver and KVM, re-applying the guest EL0 configuration if the
>>> right conditions apply (the host is VHE, we are in interrupt
>>> context, and we interrupted a running vcpu). This triggers a new
>>> vcpu request which will apply the correct configuration on guest
>>> reentry.
>>>
>>> With this, we have the correct counts, even when the counters are
>>> oversubscribed.
>>>
>>> Reported-by: Huang Shijie <shijie at os.amperecomputing.com>
>>> Suggested-by: Oliver Upton <oliver.upton at linux.dev>
>>> Signed-off-by: Marc Zyngier <maz at kernel.org>
>>> Cc: Mark Rutland <mark.rutland at arm.com>
>>> Cc: Will Deacon <will at kernel.org>
>>> Link: https://lore.kernel.org/r/20230809013953.7692-1-shijie@os.amperecomputing.com
>>> ---
>>>    arch/arm64/include/asm/kvm_host.h |  1 +
>>>    arch/arm64/kvm/arm.c              |  3 +++
>>>    arch/arm64/kvm/pmu.c              | 18 ++++++++++++++++++
>>>    drivers/perf/arm_pmuv3.c          |  2 ++
>>>    include/kvm/arm_pmu.h             |  2 ++
>>>    5 files changed, 26 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>> index d3dd05bbfe23..553040e0e375 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>> @@ -49,6 +49,7 @@
>>>    #define KVM_REQ_RELOAD_GICv4	KVM_ARCH_REQ(4)
>>>    #define KVM_REQ_RELOAD_PMU	KVM_ARCH_REQ(5)
>>>    #define KVM_REQ_SUSPEND		KVM_ARCH_REQ(6)
>>> +#define KVM_REQ_RESYNC_PMU_EL0	KVM_ARCH_REQ(7)
>>>      #define KVM_DIRTY_LOG_MANUAL_CAPS
>>> (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
>>>    				     KVM_DIRTY_LOG_INITIALLY_SET)
>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>> index 72dc53a75d1c..978b0411082f 100644
>>> --- a/arch/arm64/kvm/arm.c
>>> +++ b/arch/arm64/kvm/arm.c
>>> @@ -803,6 +803,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
>>>    			kvm_pmu_handle_pmcr(vcpu,
>>>    					    __vcpu_sys_reg(vcpu, PMCR_EL0));
>>>    +		if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
>>> +			kvm_vcpu_pmu_restore_guest(vcpu);
>>> +
>>>    		if (kvm_check_request(KVM_REQ_SUSPEND, vcpu))
>>>    			return kvm_vcpu_suspend(vcpu);
>>>    diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
>>> index 121f1a14c829..0eea225fd09a 100644
>>> --- a/arch/arm64/kvm/pmu.c
>>> +++ b/arch/arm64/kvm/pmu.c
>>> @@ -236,3 +236,21 @@ bool kvm_set_pmuserenr(u64 val)
>>>    	ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
>>>    	return true;
>>>    }
>>> +
>>> +/*
>>> + * If we interrupted the guest to update the host PMU context, make
>>> + * sure we re-apply the guest EL0 state.
>>> + */
>>> +void kvm_vcpu_pmu_resync_el0(void)
>>> +{
>>> +	struct kvm_vcpu *vcpu;
>>> +
>>> +	if (!has_vhe() || !in_interrupt())
>>> +		return;
>>> +
>>> +	vcpu = kvm_get_running_vcpu();
>>> +	if (!vcpu)
>>> +		return;
>>> +
>>> +	kvm_make_request(KVM_REQ_RESYNC_PMU_EL0, vcpu);
>>> +}
>>> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
>>> index 08b3a1bf0ef6..6a3d8176f54a 100644
>>> --- a/drivers/perf/arm_pmuv3.c
>>> +++ b/drivers/perf/arm_pmuv3.c
>>> @@ -772,6 +772,8 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
>>>      	/* Enable all counters */
>>>    	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
>>> +
>>> +	kvm_vcpu_pmu_resync_el0();
>>>    }
>> I read the perf code again, and find it maybe not good to do it in
>> armv8pmu_start.
>>
>>     Assume we install a new perf event to a CPU "x" from CPU 0,  a VM
>> guest is running on CPU "x":
>>
>>      perf_event_open() --> perf_install_in_context() -->
>>
>>      call this function in  IPI interrupt: ___perf_install_in_context().
>>
>>     armv8pmu_start() will be called in ___perf_install_in_context() in IPI.
>>
>>     so kvm_vcpu_pmu_resync_el0() will _make_ a request by meeting the
>> conditions:
>>
>>               1.) in interrupt context.
>>
>>               2.) a guest is running on this CPU.
>>
>>
>> But in actually, the request should not send out.
> Why shouldn't it be applied? This isn't supposed to be always
> necessary, but it needs to be applied whenever there is a possibility
> for counters to be updated behind our back, something that is a pretty
> event anyway.

okay.


>> Please correct me if I am wrong.
>>
>> IMHO, the best solution is add  a hook in the perf/core code, and make
>> the request there.
> I disagree. I'm still completely opposed to anything that will add
> such a hook in the core perf code, specially as a weak symbol. The
> interactions must be strictly between the PMUv3 driver and KVM,
> because they are the only parties involved here.
>
> I will *not* take such a patch.

okay, please ignore my v3 patch.

Tested_by: Huang Shijie <shijie at os.amperecomputing.com>


Thanks

Huang Shijie

>
> 	M.
>



More information about the linux-arm-kernel mailing list