[PATCH] perf vendor events arm64: Add Tegra410 Olympus PMU events

Besar Wicaksono bwicaksono at nvidia.com
Fri Jan 30 10:08:45 PST 2026


Thank you, James and Ian, for the comments.
I will address the spelling mistakes in v2.

Please see my other comments inline.

> -----Original Message-----
> From: James Clark <james.clark at linaro.org>
> Sent: Wednesday, January 28, 2026 3:37 AM
> To: Ian Rogers <irogers at google.com>; Besar Wicaksono
> <bwicaksono at nvidia.com>
> Cc: john.g.garry at oracle.com; will at kernel.org; mike.leach at linaro.org;
> leo.yan at linux.dev; mark.rutland at arm.com;
> alexander.shishkin at linux.intel.com; jolsa at kernel.org;
> adrian.hunter at intel.com; peterz at infradead.org; mingo at redhat.com;
> acme at kernel.org; namhyung at kernel.org; linux-tegra at vger.kernel.org; linux-
> arm-kernel at lists.infradead.org; linux-perf-users at vger.kernel.org; linux-
> kernel at vger.kernel.org; Thomas Makin <tmakin at nvidia.com>; Vikram Sethi
> <vsethi at nvidia.com>; Rich Wiley <rwiley at nvidia.com>; Sean Kelley
> <skelley at nvidia.com>; Yifei Wan <ywan at nvidia.com>; Thierry Reding
> <treding at nvidia.com>; Jon Hunter <jonathanh at nvidia.com>; Matt Ochs
> <mochs at nvidia.com>
> Subject: Re: [PATCH] perf vendor events arm64: Add Tegra410 Olympus PMU
> events
> 
> On 28/01/2026 8:03 am, Ian Rogers wrote:
> > On Tue, Jan 27, 2026 at 3:00 PM Besar Wicaksono <bwicaksono at nvidia.com> wrote:
> >>
> >> Add JSON files for NVIDIA Tegra410 Olympus core PMU events.
> >> Also updated the common-and-microarch.json.
> >>
> >> Signed-off-by: Besar Wicaksono <bwicaksono at nvidia.com>
> >> ---
> >>   .../arch/arm64/common-and-microarch.json      |  90 +++
> >>   tools/perf/pmu-events/arch/arm64/mapfile.csv  |   1 +
> >>   .../arch/arm64/nvidia/t410/branch.json        |  45 ++
> >>   .../arch/arm64/nvidia/t410/brbe.json          |   6 +
> >>   .../arch/arm64/nvidia/t410/bus.json           |  48 ++
> >>   .../arch/arm64/nvidia/t410/exception.json     |  62 ++
> >>   .../arch/arm64/nvidia/t410/fp_operation.json  |  78 ++
> >>   .../arch/arm64/nvidia/t410/general.json       |  15 +
> >>   .../arch/arm64/nvidia/t410/l1d_cache.json     | 122 +++
> >>   .../arch/arm64/nvidia/t410/l1i_cache.json     | 114 +++
> >>   .../arch/arm64/nvidia/t410/l2d_cache.json     | 134 ++++
> >>   .../arch/arm64/nvidia/t410/ll_cache.json      | 107 +++
> >>   .../arch/arm64/nvidia/t410/memory.json        |  46 ++
> >>   .../arch/arm64/nvidia/t410/metrics.json       | 722 ++++++++++++++++++
> >>   .../arch/arm64/nvidia/t410/misc.json          | 646 ++++++++++++++++
> >>   .../arch/arm64/nvidia/t410/retired.json       |  94 +++
> >>   .../arch/arm64/nvidia/t410/spe.json           |  42 +
> >>   .../arm64/nvidia/t410/spec_operation.json     | 230 ++++++
> >>   .../arch/arm64/nvidia/t410/stall.json         | 145 ++++
> >>   .../arch/arm64/nvidia/t410/tlb.json           | 158 ++++
> >>   20 files changed, 2905 insertions(+)
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/branch.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/brbe.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/bus.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/exception.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/fp_operation.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/general.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/l1d_cache.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/l1i_cache.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/l2d_cache.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/ll_cache.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/memory.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/metrics.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/misc.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/retired.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/spe.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/spec_operation.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/stall.json
> >>   create mode 100644 tools/perf/pmu-events/arch/arm64/nvidia/t410/tlb.json
> >>
> >> diff --git a/tools/perf/pmu-events/arch/arm64/common-and-microarch.json b/tools/perf/pmu-events/arch/arm64/common-and-microarch.json
> >> index 468cb085d879..6af15776ff17 100644
> >> --- a/tools/perf/pmu-events/arch/arm64/common-and-microarch.json
> >> +++ b/tools/perf/pmu-events/arch/arm64/common-and-microarch.json
> >> @@ -179,6 +179,11 @@
> >>           "EventName": "BUS_CYCLES",
> >>           "BriefDescription": "Bus cycle"
> >>       },
> >> +    {
> >> +        "EventCode": "0x001E",
> >> +        "EventName": "CHAIN",
> >> +        "BriefDescription": "Chain a pair of event counters."
> >> +    },
> >
> > Cool stuff :-)
> >
> > For wider counters AMD does something similar, but should this be an
> > implementation detail rather than an exposed event? How does it
> > operate as an event?
> >
> 
> CHAIN should be excluded from the json, it's used internally by the
> driver when you want 64 bit counters. Userspace can't use it because it
> can't control where counters are placed to make sure they're adjacent.
> 

Sure, I will drop CHAIN from the JSON in v2.
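
For readers following along, here is a minimal sketch of what chaining does,
assuming the usual PMUv3 behaviour where CHAIN (0x1E) on an odd-numbered
counter counts overflows of the preceding even-numbered 32-bit counter. The
class and method names are hypothetical and this is a toy model, not driver
code.

```python
# Toy model of a chained even/odd counter pair. Illustrative only:
# the class and its methods are hypothetical, not kernel or perf APIs.
MASK32 = 0xFFFFFFFF


class ChainedPair:
    def __init__(self):
        self.even = 0  # counts the event the user asked for
        self.odd = 0   # programmed with CHAIN: counts overflows of `even`

    def add(self, n):
        """Advance the even counter by n events, carrying overflows into odd."""
        total = self.even + n
        self.odd = (self.odd + (total >> 32)) & MASK32
        self.even = total & MASK32

    def read64(self):
        """Combine the pair into one 64-bit value, as a driver would."""
        return (self.odd << 32) | self.even


pair = ChainedPair()
pair.add(0xFFFFFFFF)  # even counter is now at its maximum
pair.add(2)           # wraps the even counter once, odd counter becomes 1
print(hex(pair.read64()))  # 0x100000001
```

The model only works because the two counters are an adjacent even/odd pair,
which is the placement userspace cannot guarantee.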

> > [snip]
> >> diff --git a/tools/perf/pmu-events/arch/arm64/mapfile.csv b/tools/perf/pmu-events/arch/arm64/mapfile.csv
> >> index bb3fa8a33496..7f0eaa702048 100644
> >> --- a/tools/perf/pmu-events/arch/arm64/mapfile.csv
> >> +++ b/tools/perf/pmu-events/arch/arm64/mapfile.csv
> >> @@ -46,3 +46,4 @@
> >>   0x00000000500f0000,v1,ampere/emag,core
> >>   0x00000000c00fac30,v1,ampere/ampereone,core
> >>   0x00000000c00fac40,v1,ampere/ampereonex,core
> >> +0x000000004e0f0100,v1,nvidia/t410,core
> >> diff --git a/tools/perf/pmu-events/arch/arm64/nvidia/t410/branch.json b/tools/perf/pmu-events/arch/arm64/nvidia/t410/branch.json
> >> new file mode 100644
> >> index 000000000000..532bc59dc573
> >> --- /dev/null
> >> +++ b/tools/perf/pmu-events/arch/arm64/nvidia/t410/branch.json
> >> @@ -0,0 +1,45 @@
> >> +[
> >> +    {
> >> +        "ArchStdEvent": "BR_MIS_PRED",
> >> +        "PublicDescription": "The Event counts Branches which are speculatively executed and mis-predicted."
> >
> > nit: The capitalization on Event and Branches, as well as other words,
> > is a little unusual.
> >
> 
> If there's nothing specific to this CPU then the public description
> could be left out entirely. The common strings already say the same
> thing as this:
> 
>      {
>          "PublicDescription": "Mispredicted or not predicted branch
>                                speculatively executed",
>          "EventCode": "0x10",
>          "EventName": "BR_MIS_PRED",
>          "BriefDescription": "Mispredicted or not predicted branch
>                               speculatively executed"
>      },
> 
> 

I will check on this and other events.
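
One way I could shortlist candidates is sketched below. It assumes that an
entry with both "ArchStdEvent" and "PublicDescription" overrides the common
description. The helper name and the sample entries are hypothetical, and
deciding whether a description really is CPU-specific still needs a manual
read.

```python
import json


def overridden_descriptions(events):
    """Return ArchStdEvent names that also carry their own PublicDescription."""
    return [
        e["ArchStdEvent"]
        for e in events
        if "ArchStdEvent" in e and "PublicDescription" in e
    ]


# Hypothetical sample in the shape of a per-CPU event file.
sample = json.loads("""
[
    {"ArchStdEvent": "BR_MIS_PRED",
     "PublicDescription": "The Event counts Branches which are speculatively executed and mis-predicted."},
    {"ArchStdEvent": "BR_PRED"}
]
""")

candidates = overridden_descriptions(sample)
print(candidates)  # ['BR_MIS_PRED']
```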

> > [snip]
> >> diff --git a/tools/perf/pmu-events/arch/arm64/nvidia/t410/metrics.json b/tools/perf/pmu-events/arch/arm64/nvidia/t410/metrics.json
> >> new file mode 100644
> >> index 000000000000..18c2fd58ee9e
> >> --- /dev/null
> >> +++ b/tools/perf/pmu-events/arch/arm64/nvidia/t410/metrics.json
> >> @@ -0,0 +1,722 @@
> >> +[
> >> +    {
> >> +        "MetricName": "backend_bound",
> >> +        "MetricExpr": "100 * (STALL_SLOT_BACKEND / CPU_SLOT)",
> >> +        "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the backend of the processor.",
> >> +        "ScaleUnit": "1percent of slots",
> >> +        "MetricGroup": "TopdownL1"
> >
> > Note, on x86 we place TopdownL1 in the Default metric group so it is
> > shown by `perf stat` when it isn't given events or metrics to compute.
> > Perhaps you want to do something similar?
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/x86/icelake/icl-metrics.json?h=tmp.perf-tools-next#n116
> >
> > Thanks,
> > Ian
> 
> That's currently true for the Arm topdown metrics in sbsa.json, but we
> probably don't actually want this because there aren't enough counters
> for the perf stat defaults plus topdown and it results in multiplexing
> and bad results.
> 
> I was planning to change it but we also don't currently expose the
> number of counters available either...
> 

I agree with James: there may not be enough counters to cover all the
events in the Default + TopdownL1 groups.

Ian, James: in general, is it fine to put related metrics in the same group
even though doing so may cause multiplexing?
For example, the "Miss_Ratio" group in this patch provides miss ratios for L1, L2, TLB, etc.
It is convenient to get the miss ratio of all cache levels from a single group.
The group needs many events, so the results may be inaccurate due to
multiplexing, but the user gets a rough, broad view as a starting point.
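
To illustrate why I call the result rough, here is a small sketch of the
usual scaling applied to a multiplexed count (raw count multiplied by
time_enabled / time_running). The function name and all numbers are made up
for illustration.

```python
def scaled_count(raw, time_enabled, time_running):
    """Extrapolate a multiplexed count to the whole enabled window."""
    if time_running == 0:
        return None  # the event was never scheduled, nothing to scale
    return raw * time_enabled / time_running


# Hypothetical readings: the two events were on the PMU for different
# fractions of the run, so each one is extrapolated separately.
misses = scaled_count(raw=1_000, time_enabled=100, time_running=25)
accesses = scaled_count(raw=40_000, time_enabled=100, time_running=50)

miss_ratio = misses / accesses
print(miss_ratio)  # 0.05
```

Since the two events may have sampled different phases of the workload, the
ratio is an estimate rather than an exact value.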

Thanks,
Besar

