[PATCH v5 2/5] KVM: x86: Provide a capability to disable APERF/MPERF read intercepts

Xiaoyao Li xiaoyao.li at intel.com
Thu Jun 26 01:59:00 PDT 2025


On 6/26/2025 8:12 AM, Sean Christopherson wrote:
> From: Jim Mattson <jmattson at google.com>
> 
> Allow a guest to read the physical IA32_APERF and IA32_MPERF MSRs
> without interception.
> 
> The IA32_APERF and IA32_MPERF MSRs are not virtualized. Writes are not
> handled at all. The MSR values are not zeroed on vCPU creation, saved
> on suspend, or restored on resume. No accommodation is made for
> processor migration or for sharing a logical processor with other
> tasks. No adjustments are made for non-unit TSC multipliers. The MSRs
> do not account for time the same way as the comparable PMU events,
> whether the PMU is virtualized by the traditional emulation method or
> the new mediated pass-through approach.
> 
> Nonetheless, in a properly constrained environment, this capability
> can be combined with a guest CPUID table that advertises support for
> CPUID.6:ECX.APERFMPERF[bit 0] to induce a Linux guest to report the
> effective physical CPU frequency in /proc/cpuinfo. Moreover, there is
> no performance cost for this capability.
> 
> Signed-off-by: Jim Mattson <jmattson at google.com>
> Link: https://lore.kernel.org/r/20250530185239.2335185-3-jmattson@google.com
> Signed-off-by: Sean Christopherson <seanjc at google.com>
> ---
>   Documentation/virt/kvm/api.rst | 23 +++++++++++++++++++++++
>   arch/x86/kvm/svm/nested.c      |  4 +++-
>   arch/x86/kvm/svm/svm.c         |  5 +++++
>   arch/x86/kvm/vmx/nested.c      |  6 ++++++
>   arch/x86/kvm/vmx/vmx.c         |  4 ++++
>   arch/x86/kvm/x86.c             |  6 +++++-
>   arch/x86/kvm/x86.h             |  5 +++++
>   include/uapi/linux/kvm.h       |  1 +
>   tools/include/uapi/linux/kvm.h |  1 +
>   9 files changed, 53 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 43ed57e048a8..27ced3ee2b53 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7844,6 +7844,7 @@ Valid bits in args[0] are::
>     #define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
>     #define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
>     #define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)
> +  #define KVM_X86_DISABLE_EXITS_APERFMPERF       (1 << 4)
>   
>   Enabling this capability on a VM provides userspace with a way to no
>   longer intercept some instructions for improved latency in some
> @@ -7854,6 +7855,28 @@ all such vmexits.
>   
>   Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
>   
> +Virtualizing the ``IA32_APERF`` and ``IA32_MPERF`` MSRs requires more
> +than just disabling APERF/MPERF exits. While both Intel and AMD
> +document strict usage conditions for these MSRs--emphasizing that only
> +the ratio of their deltas over a time interval (T0 to T1) is
> +architecturally defined--simply passing through the MSRs can still
> +produce an incorrect ratio.
> +
> +This erroneous ratio can occur if, between T0 and T1:
> +
> +1. The vCPU thread migrates between logical processors.
> +2. Live migration or suspend/resume operations take place.
> +3. Another task shares the vCPU's logical processor.
> +4. C-states lower thean C0 are emulated (e.g., via HLT interception).

s/thean/than/

Reviewed-by: Xiaoyao Li <xiaoyao.li at intel.com>
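
For readers following the documentation hunk above: the only architecturally defined use of these MSRs is the ratio of their deltas over an interval T0..T1. A minimal sketch of that arithmetic follows. It assumes the caller has already sampled both MSRs at T0 and T1 and knows the base frequency. The helper name and parameters are hypothetical and are not part of this patch or of KVM.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: estimate the effective
 * frequency over an interval from two APERF/MPERF samples.
 *
 *   effective = base * (APERF(T1) - APERF(T0)) / (MPERF(T1) - MPERF(T0))
 *
 * Unsigned subtraction keeps the deltas correct across a single counter
 * wraparound. Returns 0 if MPERF did not advance, since the ratio is
 * undefined in that case.
 */
static uint64_t effective_khz(uint64_t base_khz,
			      uint64_t aperf_t0, uint64_t aperf_t1,
			      uint64_t mperf_t0, uint64_t mperf_t1)
{
	uint64_t delta_aperf = aperf_t1 - aperf_t0;
	uint64_t delta_mperf = mperf_t1 - mperf_t0;

	if (delta_mperf == 0)
		return 0;

	/* Floating point avoids overflow in base_khz * delta_aperf. */
	return (uint64_t)((double)base_khz * (double)delta_aperf /
			  (double)delta_mperf);
}
```

With a 2000000 kHz base, an APERF delta of 3000 and an MPERF delta of 2000, this yields 3000000 kHz. If any of the listed events occurs between T0 and T1 (for example, the vCPU thread migrates to another logical processor), the two samples come from different counters or include another task's cycles, and the result no longer describes the guest.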


