[PATCH v4] perf/arm_pmu: Skip PMCCNTR_EL0 on NVIDIA Olympus

Besar Wicaksono bwicaksono at nvidia.com
Mon Jun 8 17:01:31 PDT 2026


Hi Will,

My apologies for taking a while to respond.
Please see my reply inline.

> -----Original Message-----
> From: Will Deacon <will at kernel.org>
> Sent: Tuesday, May 19, 2026 6:41 AM
> To: Besar Wicaksono <bwicaksono at nvidia.com>
> Cc: mark.rutland at arm.com; james.clark at linaro.org; yangyccccc at gmail.com;
> linux-arm-kernel at lists.infradead.org; linux-kernel at vger.kernel.org;
> linux-tegra at vger.kernel.org; Thierry Reding <treding at nvidia.com>;
> Jon Hunter <jonathanh at nvidia.com>; Vikram Sethi <vsethi at nvidia.com>;
> Rich Wiley <rwiley at nvidia.com>; Shanker Donthineni <sdonthineni at nvidia.com>;
> Matt Ochs <mochs at nvidia.com>; Nirmoy Das <nirmoyd at nvidia.com>;
> Sean Kelley <skelley at nvidia.com>
> Subject: Re: [PATCH v4] perf/arm_pmu: Skip PMCCNTR_EL0 on NVIDIA Olympus
> 
> On Mon, May 04, 2026 at 05:52:04PM +0000, Besar Wicaksono wrote:
> > PMCCNTR_EL0 may continue to increment on NVIDIA Olympus CPUs while the
> > PE is in WFI/WFE. That does not necessarily match the CPU_CYCLES event
> > counted by a programmable counter, so using PMCCNTR_EL0 for cycles can
> > give results that differ from the programmable counter path.
> >
> > Extend the existing PMCCNTR avoidance decision from the SMT case to
> > also cover Olympus. Store the result in the common arm_pmu state at
> > registration time, so arm_pmuv3 can keep using a single flag when
> > deciding whether CPU_CYCLES may use PMCCNTR_EL0.
> >
> > Signed-off-by: Besar Wicaksono <bwicaksono at nvidia.com>
> > ---
> >
> > Changes from v1:
> >   * add CONFIG_ARM64 check to fix build error found by kernel test robot
> >   * add explicit include of <asm/cputype.h>
> > v1: https://lore.kernel.org/linux-arm-kernel/20260406232034.2566133-1-bwicaksono@nvidia.com/
> >
> > Changes from v2:
> >   * Move the Olympus PMCCNTR avoidance check from arm_pmuv3.c to the
> >     common arm_pmu registration path.
> >   * Replace the PMUv3-only has_smt flag with avoid_pmccntr, covering both
> >     the existing SMT restriction and the Olympus MIDR restriction.
> >   * Use the cached per-CPU MIDR from cpu_data instead of calling
> >     is_midr_in_range_list() from armv8pmu_can_use_pmccntr().
> >   * Add the required asm/cpu.h include for cpu_data.
> > v2: https://lore.kernel.org/linux-arm-kernel/20260421203856.3539186-1-bwicaksono@nvidia.com/#t
> >
> > Changes from v3:
> >   * Move avoidance check based on MIDR to __armv8pmu_probe_pmu() to make sure
> >     the MIDR is retrieved from the correct online CPU.
> > v3: https://lore.kernel.org/linux-arm-kernel/20260429215614.1793131-1-bwicaksono@nvidia.com/
> >
> > ---
> >  drivers/perf/arm_pmu.c       |  7 ++++-
> >  drivers/perf/arm_pmuv3.c     | 51 +++++++++++++++++++++++++++++++-----
> >  include/linux/perf/arm_pmu.h |  2 +-
> >  3 files changed, 51 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 939bcbd433aa..aa1dac0b440f 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -931,8 +931,13 @@ int armpmu_register(struct arm_pmu *pmu)
> >       /*
> >        * By this stage we know our supported CPUs on either DT/ACPI platforms,
> >        * detect the SMT implementation.
> > +      * On SMT CPUs, the PMCCNTR_EL0 increments from the processor clock rather
> > +      * than the PE clock (ARM DDI0487 L.b D13.1.3) which means it'll continue
> > +      * counting on a WFI PE if one of its SMT sibling is not idle on a
> > +      * multi-threaded implementation. So don't use it on SMT cores.
> >        */
> > -     pmu->has_smt = topology_core_has_smt(cpumask_first(&pmu->supported_cpus));
> > +     pmu->avoid_pmccntr |=
> > +             topology_core_has_smt(cpumask_first(&pmu->supported_cpus));
> >
> >       if (!pmu->set_event_filter)
> >               pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
> > diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> > index 8014ff766cff..1ee4a09d0dcc 100644
> > --- a/drivers/perf/arm_pmuv3.c
> > +++ b/drivers/perf/arm_pmuv3.c
> > @@ -8,6 +8,7 @@
> >   * This code is based heavily on the ARMv7 perf event code.
> >   */
> >
> > +#include <asm/cputype.h>
> >  #include <asm/irq_regs.h>
> >  #include <asm/perf_event.h>
> >  #include <asm/virt.h>
> > @@ -1002,13 +1003,7 @@ static bool armv8pmu_can_use_pmccntr(struct pmu_hw_events *cpuc,
> >       if (has_branch_stack(event))
> >               return false;
> >
> > -     /*
> > -      * The PMCCNTR_EL0 increments from the processor clock rather than
> > -      * the PE clock (ARM DDI0487 L.b D13.1.3) which means it'll continue
> > -      * counting on a WFI PE if one of its SMT sibling is not idle on a
> > -      * multi-threaded implementation. So don't use it on SMT cores.
> > -      */
> > -     if (cpu_pmu->has_smt)
> > +     if (cpu_pmu->avoid_pmccntr)
> >               return false;
> >
> >       return true;
> > @@ -1299,6 +1294,41 @@ static int armv8_vulcan_map_event(struct perf_event *event)
> >                                      &armv8_vulcan_perf_cache_map);
> >  }
> >
> > +#ifdef CONFIG_ARM64
> > +/*
> > + * List of CPUs that should avoid using PMCCNTR_EL0.
> > + */
> > +static struct midr_range armv8pmu_avoid_pmccntr_cpus[] = {
> > +     /*
> > +      * The PMCCNTR_EL0 in Olympus CPU may still increment while in WFI/WFE state.
> > +      * This is an implementation specific behavior and not an erratum.
> > +      *
> > +      * From ARM DDI0487 D14.4:
> > +      *   It is IMPLEMENTATION SPECIFIC whether CPU_CYCLES and PMCCNTR count
> > +      *   when the PE is in WFI or WFE state, even if the clocks are not stopped.
> 
> So surely the weird part here is that Olympus chose one behaviour for
> PMCCNTR and another for the CPU_CYCLES event? The Arm ARM text isn't

That is correct.

> clear to me as to whether that's permitted but I think we should call
> it out here.
> 

Sure, I will call it out explicitly.

> > +      * From ARM DDI0487 D24.5.2:
> > +      *   All counters are subject to any changes in clock frequency, including
> > +      *   clock stopping caused by the WFI and WFE instructions.
> > +      *   This means that it is CONSTRAINED UNPREDICTABLE whether or not
> > +      *   PMCCNTR_EL0 continues to increment when clocks are stopped by WFI and
> > +      *   WFE instructions.
> > +      */
> > +     MIDR_ALL_VERSIONS(MIDR_NVIDIA_OLYMPUS),
> > +     {}
> > +};
> > +
> > +static bool armv8pmu_is_in_avoid_pmccntr_cpus(void)
> > +{
> > +     return is_midr_in_range_list(armv8pmu_avoid_pmccntr_cpus);
> > +}
> > +#else
> > +static bool armv8pmu_is_in_avoid_pmccntr_cpus(void)
> > +{
> > +     return false;
> > +}
> > +#endif
> > +
> >  struct armv8pmu_probe_info {
> >       struct arm_pmu *pmu;
> >       bool present;
> > @@ -1348,6 +1378,13 @@ static void __armv8pmu_probe_pmu(void *info)
> >       else
> >               cpu_pmu->reg_pmmir = 0;
> >
> > +     /*
> > +      * On some CPUs, PMCCNTR_EL0 does not match the behavior of CPU_CYCLES
> > +      * programmable counter, so avoid routing cycles through PMCCNTR_EL0 to
> > +      * prevent inconsistency in the results.
> > +      */
> > +     cpu_pmu->avoid_pmccntr |= armv8pmu_is_in_avoid_pmccntr_cpus();
> 
> Do we also want to hide the cycle counter from userspace? It sounds like
> it's going to get very confused if it tries to use it...
> 

Makes sense. I have attempted this change in v5.
Please see https://lore.kernel.org/linux-arm-kernel/20260608234135.1856911-1-bwicaksono@nvidia.com/T/#u
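
For anyone following the thread without the tree at hand, the flag logic
in the quoted v4 diff can be modelled in user space roughly as below. This
is only a sketch, not the kernel code and not the v5 change: the names
pmu_model, model_probe(), model_register() and model_can_use_pmccntr() are
made up for illustration, the EXAMPLE_* values are placeholders rather than
the real Olympus MIDR, and the match ignores the MIDR architecture field
and the other conditions that armv8pmu_can_use_pmccntr() checks.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * MIDR_EL1 field layout: implementer [31:24], variant [23:20],
 * architecture [19:16], part number [15:4], revision [3:0].
 */
#define MIDR_IMPLEMENTER(m) (((m) >> 24) & 0xffu)
#define MIDR_PARTNUM(m)     (((m) >> 4) & 0xfffu)

/* Placeholder values for illustration only; NOT the real Olympus MIDR. */
#define EXAMPLE_IMPLEMENTER 0x4eu
#define EXAMPLE_PARTNUM     0x123u

/* Hypothetical stand-in for the one field of struct arm_pmu that matters here. */
struct pmu_model {
	bool avoid_pmccntr;
};

/* "All versions" match: variant and revision are deliberately ignored. */
static bool midr_matches_all_versions(uint32_t midr)
{
	return MIDR_IMPLEMENTER(midr) == EXAMPLE_IMPLEMENTER &&
	       MIDR_PARTNUM(midr) == EXAMPLE_PARTNUM;
}

/* Models the probe step: set the flag if the probed CPU is on the list. */
static void model_probe(struct pmu_model *pmu, uint32_t midr)
{
	pmu->avoid_pmccntr |= midr_matches_all_versions(midr);
}

/* Models the registration step: OR in the SMT restriction, never clear. */
static void model_register(struct pmu_model *pmu, bool core_has_smt)
{
	pmu->avoid_pmccntr |= core_has_smt;
}

/* Models only the final flag check; the real function has more conditions. */
static bool model_can_use_pmccntr(const struct pmu_model *pmu)
{
	return !pmu->avoid_pmccntr;
}
```

The point of the |= in both steps is that either restriction alone is
enough to keep CPU_CYCLES off PMCCNTR_EL0, and the later step cannot undo
a decision made by the earlier one.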

Thanks,
Besar