[PATCH v4] perf/arm_pmu: Skip PMCCNTR_EL0 on NVIDIA Olympus
James Clark
james.clark at linaro.org
Tue May 5 01:17:58 PDT 2026
On 04/05/2026 6:52 pm, Besar Wicaksono wrote:
> PMCCNTR_EL0 may continue to increment on NVIDIA Olympus CPUs while the
> PE is in WFI/WFE. That does not necessarily match the CPU_CYCLES event
> counted by a programmable counter, so using PMCCNTR_EL0 for cycles can
> give results that differ from the programmable counter path.
>
> Extend the existing PMCCNTR avoidance decision from the SMT case to
> also cover Olympus. Store the result in the common arm_pmu state at
> registration time, so arm_pmuv3 can keep using a single flag when
> deciding whether CPU_CYCLES may use PMCCNTR_EL0.
>
> Signed-off-by: Besar Wicaksono <bwicaksono at nvidia.com>
> ---
>
> Changes from v1:
> * add CONFIG_ARM64 check to fix build error found by kernel test robot
> * add explicit include of <asm/cputype.h>
> v1: https://lore.kernel.org/linux-arm-kernel/20260406232034.2566133-1-bwicaksono@nvidia.com/
>
> Changes from v2:
> * Move the Olympus PMCCNTR avoidance check from arm_pmuv3.c to the
> common arm_pmu registration path.
> * Replace the PMUv3-only has_smt flag with avoid_pmccntr, covering both
> the existing SMT restriction and the Olympus MIDR restriction.
> * Use the cached per-CPU MIDR from cpu_data instead of calling
> is_midr_in_range_list() from armv8pmu_can_use_pmccntr().
> * Add the required asm/cpu.h include for cpu_data.
> v2: https://lore.kernel.org/linux-arm-kernel/20260421203856.3539186-1-bwicaksono@nvidia.com/#t
>
> Changes from v3:
> * Move avoidance check based on MIDR to __armv8pmu_probe_pmu() to make sure
> the MIDR is retrieved from the correct online CPU.
> v3: https://lore.kernel.org/linux-arm-kernel/20260429215614.1793131-1-bwicaksono@nvidia.com/
>
> ---
> drivers/perf/arm_pmu.c | 7 ++++-
> drivers/perf/arm_pmuv3.c | 51 +++++++++++++++++++++++++++++++-----
> include/linux/perf/arm_pmu.h | 2 +-
> 3 files changed, 51 insertions(+), 9 deletions(-)
>
Reviewed-by: James Clark <james.clark at linaro.org>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 939bcbd433aa..aa1dac0b440f 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -931,8 +931,13 @@ int armpmu_register(struct arm_pmu *pmu)
> /*
> * By this stage we know our supported CPUs on either DT/ACPI platforms,
> * detect the SMT implementation.
> + * On SMT CPUs, PMCCNTR_EL0 increments from the processor clock rather
> + * than the PE clock (ARM DDI0487 L.b D13.1.3), which means it continues
> + * counting on a PE in WFI if one of its SMT siblings is not idle on a
> + * multi-threaded implementation. So don't use it on SMT cores.
> */
> - pmu->has_smt = topology_core_has_smt(cpumask_first(&pmu->supported_cpus));
> + pmu->avoid_pmccntr |=
> + topology_core_has_smt(cpumask_first(&pmu->supported_cpus));
>
> if (!pmu->set_event_filter)
> pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 8014ff766cff..1ee4a09d0dcc 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -8,6 +8,7 @@
> * This code is based heavily on the ARMv7 perf event code.
> */
>
> +#include <asm/cputype.h>
> #include <asm/irq_regs.h>
> #include <asm/perf_event.h>
> #include <asm/virt.h>
> @@ -1002,13 +1003,7 @@ static bool armv8pmu_can_use_pmccntr(struct pmu_hw_events *cpuc,
> if (has_branch_stack(event))
> return false;
>
> - /*
> - * The PMCCNTR_EL0 increments from the processor clock rather than
> - * the PE clock (ARM DDI0487 L.b D13.1.3) which means it'll continue
> - * counting on a WFI PE if one of its SMT sibling is not idle on a
> - * multi-threaded implementation. So don't use it on SMT cores.
> - */
> - if (cpu_pmu->has_smt)
> + if (cpu_pmu->avoid_pmccntr)
> return false;
>
> return true;
> @@ -1299,6 +1294,41 @@ static int armv8_vulcan_map_event(struct perf_event *event)
> &armv8_vulcan_perf_cache_map);
> }
>
> +#ifdef CONFIG_ARM64
> +/*
> + * List of CPUs that should avoid using PMCCNTR_EL0.
> + */
> +static struct midr_range armv8pmu_avoid_pmccntr_cpus[] = {
> + /*
> + * PMCCNTR_EL0 on Olympus CPUs may still increment while the PE is in the
> + * WFI/WFE state.
> + * This is an implementation specific behavior and not an erratum.
> + *
> + * From ARM DDI0487 D14.4:
> + * It is IMPLEMENTATION SPECIFIC whether CPU_CYCLES and PMCCNTR count
> + * when the PE is in WFI or WFE state, even if the clocks are not stopped.
> + *
> + * From ARM DDI0487 D24.5.2:
> + * All counters are subject to any changes in clock frequency, including
> + * clock stopping caused by the WFI and WFE instructions.
> + * This means that it is CONSTRAINED UNPREDICTABLE whether or not
> + * PMCCNTR_EL0 continues to increment when clocks are stopped by WFI and
> + * WFE instructions.
> + */
> + MIDR_ALL_VERSIONS(MIDR_NVIDIA_OLYMPUS),
> + {}
> +};
> +
> +static bool armv8pmu_is_in_avoid_pmccntr_cpus(void)
> +{
> + return is_midr_in_range_list(armv8pmu_avoid_pmccntr_cpus);
> +}
> +#else
> +static bool armv8pmu_is_in_avoid_pmccntr_cpus(void)
> +{
> + return false;
> +}
> +#endif
> +
> struct armv8pmu_probe_info {
> struct arm_pmu *pmu;
> bool present;
> @@ -1348,6 +1378,13 @@ static void __armv8pmu_probe_pmu(void *info)
> else
> cpu_pmu->reg_pmmir = 0;
>
> + /*
> + * On some CPUs, PMCCNTR_EL0 does not match the behavior of the CPU_CYCLES
> + * event on a programmable counter, so avoid routing cycle events through
> + * PMCCNTR_EL0 to keep results consistent between the two paths.
> + */
> + cpu_pmu->avoid_pmccntr |= armv8pmu_is_in_avoid_pmccntr_cpus();
> +
> brbe_probe(cpu_pmu);
> }
>
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 52b37f7bdbf9..02d2c7f45b52 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -119,7 +119,7 @@ struct arm_pmu {
>
> /* PMUv3 only */
> int pmuver;
> - bool has_smt;
> + bool avoid_pmccntr;
> u64 reg_pmmir;
> u64 reg_brbidr;
> #define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40