[PATCH v2] perf/arm_pmu: Skip PMCCNTR_EL0 on NVIDIA Olympus

Besar Wicaksono bwicaksono at nvidia.com
Wed Apr 22 13:17:11 PDT 2026



> -----Original Message-----
> From: James Clark <james.clark at linaro.org>
> Sent: Wednesday, April 22, 2026 5:33 AM
> To: Besar Wicaksono <bwicaksono at nvidia.com>; will at kernel.org;
> mark.rutland at arm.com
> Cc: linux-arm-kernel at lists.infradead.org; linux-kernel at vger.kernel.org; linux-
> tegra at vger.kernel.org; Thierry Reding <treding at nvidia.com>; Jon Hunter
> <jonathanh at nvidia.com>; Vikram Sethi <vsethi at nvidia.com>; Rich Wiley
> <rwiley at nvidia.com>; Shanker Donthineni <sdonthineni at nvidia.com>; Matt
> Ochs <mochs at nvidia.com>; Nirmoy Das <nirmoyd at nvidia.com>; Sean Kelley
> <skelley at nvidia.com>
> Subject: Re: [PATCH v2] perf/arm_pmu: Skip PMCCNTR_EL0 on NVIDIA
> Olympus
> 
> On 21/04/2026 21:38, Besar Wicaksono wrote:
> > The PMCCNTR_EL0 in NVIDIA Olympus CPU may increment while
> > in WFI/WFE, which does not align with counting CPU_CYCLES
> > on a programmable counter. Add a MIDR range entry and
> > refuse PMCCNTR_EL0 for cycle events on affected parts so
> > perf does not mix the two behaviors.
> >
> > Signed-off-by: Besar Wicaksono <bwicaksono at nvidia.com>
> > ---
> >
> > Changes from v1:
> >    * add CONFIG_ARM64 check to fix build error found by kernel test robot
> >    * add explicit include of <asm/cputype.h>
> > v1: https://lore.kernel.org/linux-arm-kernel/20260406232034.2566133-1-bwicaksono at nvidia.com/
> >
> > ---
> >   drivers/perf/arm_pmuv3.c | 44 ++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 44 insertions(+)
> >
> > diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> > index 8014ff766cff..7c39d0804b9f 100644
> > --- a/drivers/perf/arm_pmuv3.c
> > +++ b/drivers/perf/arm_pmuv3.c
> > @@ -8,6 +8,7 @@
> >    * This code is based heavily on the ARMv7 perf event code.
> >    */
> >
> > +#include <asm/cputype.h>
> >   #include <asm/irq_regs.h>
> >   #include <asm/perf_event.h>
> >   #include <asm/virt.h>
> > @@ -978,6 +979,41 @@ static int armv8pmu_get_chain_idx(struct pmu_hw_events *cpuc,
> >       return -EAGAIN;
> >   }
> >
> > +#ifdef CONFIG_ARM64
> > +/*
> > + * List of CPUs that should avoid using PMCCNTR_EL0.
> > + */
> > +static struct midr_range armv8pmu_avoid_pmccntr_cpus[] = {
> > +     /*
> > +      * The PMCCNTR_EL0 in Olympus CPU may still increment while in WFI/WFE state.
> > +      * This is an implementation specific behavior and not an erratum.
> > +      *
> > +      * From ARM DDI0487 D14.4:
> > +      *   It is IMPLEMENTATION SPECIFIC whether CPU_CYCLES and PMCCNTR count
> > +      *   when the PE is in WFI or WFE state, even if the clocks are not stopped.
> > +      *
> > +      * From ARM DDI0487 D24.5.2:
> > +      *   All counters are subject to any changes in clock frequency, including
> > +      *   clock stopping caused by the WFI and WFE instructions.
> > +      *   This means that it is CONSTRAINED UNPREDICTABLE whether or not
> > +      *   PMCCNTR_EL0 continues to increment when clocks are stopped by WFI and
> > +      *   WFE instructions.
> > +      */
> > +     MIDR_ALL_VERSIONS(MIDR_NVIDIA_OLYMPUS),
> > +     {}
> > +};
> > +
> > +static bool armv8pmu_is_in_avoid_pmccntr_cpus(void)
> > +{
> > +     return is_midr_in_range_list(armv8pmu_avoid_pmccntr_cpus);
> > +}
> > +#else
> > +static bool armv8pmu_is_in_avoid_pmccntr_cpus(void)
> > +{
> > +     return false;
> > +}
> > +#endif
> > +
> >   static bool armv8pmu_can_use_pmccntr(struct pmu_hw_events *cpuc,
> >                                    struct perf_event *event)
> >   {
> > @@ -1011,6 +1047,14 @@ static bool armv8pmu_can_use_pmccntr(struct pmu_hw_events *cpuc,
> >       if (cpu_pmu->has_smt)
> >               return false;
> >
> > +     /*
> > +      * On some CPUs, PMCCNTR_EL0 does not match the behavior of CPU_CYCLES
> > +      * programmable counter, so avoid routing cycles through PMCCNTR_EL0 to
> > +      * prevent inconsistency in the results.
> > +      */
> > +     if (armv8pmu_is_in_avoid_pmccntr_cpus())
> > +             return false;
> > +
> 
> Hi Besar,
> 
> This is called from armpmu_event_init() before the event is scheduled on
> the CPU so I don't think reading the MIDR at this point is safe.
> 
> When the PMU is probed you probably need to do an SMP call to get the
> MIDR of CPUs in that PMU's mask and then cache the "avoid pmccntr"
> result like has_smt. Or even rename has_smt to avoid_pmccntr and combine
> the two results there.
> 
> I don't know what will happen if none of those CPUs are online when the
> PMU is probed though...
> 

Hi James,

IIUC, has_smt is common to all the supported CPUs of the PMU context.
It is set based on the first CPU in the supported CPU list:

    pmu->has_smt = topology_core_has_smt(cpumask_first(&pmu->supported_cpus));

Is it okay to use the same approach for the MIDR check? Can we assume that
all CPUs in supported_cpus have the same MIDR?
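
To make the trade-off concrete, here is a rough userspace model (not kernel
code) of caching the result at probe time. Everything in it is hypothetical:
struct model_pmu, its avoid_pmccntr field, both probe helpers, and
EXAMPLE_AVOID_MODEL, which is a placeholder and not the real Olympus MIDR.
Only the MIDR_EL1 field layout is architectural. The first helper samples one
CPU the way has_smt does; the second is a more conservative variant that
checks every CPU in the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * MIDR_EL1 layout: implementer [31:24], variant [23:20],
 * architecture [19:16], part number [15:4], revision [3:0].
 * An "all versions" match ignores variant and revision.
 */
#define MODEL_MASK 0xff0ffff0u

/* HYPOTHETICAL placeholder value, NOT the real Olympus MIDR. */
#define EXAMPLE_AVOID_MODEL 0x4e0f0100u

#define MAX_CPUS 8

/* Hypothetical stand-in for the per-PMU state; not struct arm_pmu. */
struct model_pmu {
	uint32_t supported_cpus;	/* bit N set => CPU N belongs to this PMU */
	bool avoid_pmccntr;		/* result cached once at probe time */
};

static bool midr_matches_all_versions(uint32_t midr, uint32_t model)
{
	return (midr & MODEL_MASK) == (model & MODEL_MASK);
}

/* Same idea as has_smt: decide from the first supported CPU only. */
static bool probe_first_cpu(const struct model_pmu *pmu,
			    const uint32_t *midr_of_cpu)
{
	assert(pmu != NULL && midr_of_cpu != NULL);

	for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
		if (pmu->supported_cpus & (1u << cpu))
			return midr_matches_all_versions(midr_of_cpu[cpu],
							 EXAMPLE_AVOID_MODEL);
	}
	return false;	/* empty mask: nothing to avoid */
}

/* Conservative variant: avoid PMCCNTR if any supported CPU matches. */
static bool probe_any_cpu(const struct model_pmu *pmu,
			  const uint32_t *midr_of_cpu)
{
	assert(pmu != NULL && midr_of_cpu != NULL);

	for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
		if ((pmu->supported_cpus & (1u << cpu)) &&
		    midr_matches_all_versions(midr_of_cpu[cpu],
					      EXAMPLE_AVOID_MODEL))
			return true;
	}
	return false;
}
```

If every CPU in supported_cpus shares one MIDR, both helpers agree. They
differ only when the mask mixes models, where sampling the first CPU may
miss an affected core.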

Thanks,
Besar





