[PATCH] perf vendor events arm64: Add Tegra410 Olympus PMU events

James Clark james.clark at linaro.org
Mon Feb 2 01:58:53 PST 2026



On 30/01/2026 6:08 pm, Besar Wicaksono wrote:
> Thank you James and Ian for the comments.
> I will try to address the spelling mistakes on v2.
> 
> Please see my other comments inline.
> 
>> -----Original Message-----
>> From: James Clark <james.clark at linaro.org>
>> Sent: Wednesday, January 28, 2026 3:37 AM
>> To: Ian Rogers <irogers at google.com>; Besar Wicaksono
>> <bwicaksono at nvidia.com>
>> Cc: john.g.garry at oracle.com; will at kernel.org; mike.leach at linaro.org;
>> leo.yan at linux.dev; mark.rutland at arm.com;
>> alexander.shishkin at linux.intel.com; jolsa at kernel.org;
>> adrian.hunter at intel.com; peterz at infradead.org; mingo at redhat.com;
>> acme at kernel.org; namhyung at kernel.org; linux-tegra at vger.kernel.org; linux-
>> arm-kernel at lists.infradead.org; linux-perf-users at vger.kernel.org; linux-
>> kernel at vger.kernel.org; Thomas Makin <tmakin at nvidia.com>; Vikram Sethi
>> <vsethi at nvidia.com>; Rich Wiley <rwiley at nvidia.com>; Sean Kelley
>> <skelley at nvidia.com>; Yifei Wan <ywan at nvidia.com>; Thierry Reding
>> <treding at nvidia.com>; Jon Hunter <jonathanh at nvidia.com>; Matt Ochs
>> <mochs at nvidia.com>
>> Subject: Re: [PATCH] perf vendor events arm64: Add Tegra410 Olympus PMU
>> events
>>
>> On 28/01/2026 8:03 am, Ian Rogers wrote:
>>> On Tue, Jan 27, 2026 at 3:00 PM Besar Wicaksono
>> <bwicaksono at nvidia.com> wrote:
>>>>
>>>> Add JSON files for NVIDIA Tegra410 Olympus core PMU events.
>>>> Also updated the common-and-microarch.json.
>>>>
>>>> Signed-off-by: Besar Wicaksono <bwicaksono at nvidia.com>
>>>> ---
>>>>    .../arch/arm64/common-and-microarch.json      |  90 +++
>>>>    tools/perf/pmu-events/arch/arm64/mapfile.csv  |   1 +
>>>>    .../arch/arm64/nvidia/t410/branch.json        |  45 ++
>>>>    .../arch/arm64/nvidia/t410/brbe.json          |   6 +
>>>>    .../arch/arm64/nvidia/t410/bus.json           |  48 ++
>>>>    .../arch/arm64/nvidia/t410/exception.json     |  62 ++
>>>>    .../arch/arm64/nvidia/t410/fp_operation.json  |  78 ++
>>>>    .../arch/arm64/nvidia/t410/general.json       |  15 +
>>>>    .../arch/arm64/nvidia/t410/l1d_cache.json     | 122 +++
>>>>    .../arch/arm64/nvidia/t410/l1i_cache.json     | 114 +++
>>>>    .../arch/arm64/nvidia/t410/l2d_cache.json     | 134 ++++
>>>>    .../arch/arm64/nvidia/t410/ll_cache.json      | 107 +++
>>>>    .../arch/arm64/nvidia/t410/memory.json        |  46 ++
>>>>    .../arch/arm64/nvidia/t410/metrics.json       | 722
>> ++++++++++++++++++
>>>>    .../arch/arm64/nvidia/t410/misc.json          | 646 ++++++++++++++++
>>>>    .../arch/arm64/nvidia/t410/retired.json       |  94 +++
>>>>    .../arch/arm64/nvidia/t410/spe.json           |  42 +
>>>>    .../arm64/nvidia/t410/spec_operation.json     | 230 ++++++
>>>>    .../arch/arm64/nvidia/t410/stall.json         | 145 ++++
>>>>    .../arch/arm64/nvidia/t410/tlb.json           | 158 ++++
>>>>    20 files changed, 2905 insertions(+)
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/branch.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/brbe.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/bus.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/exception.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/fp_operation.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/general.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/l1d_cache.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/l1i_cache.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/l2d_cache.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/ll_cache.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/memory.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/metrics.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/misc.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/retired.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/spe.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/spec_operation.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/stall.json
>>>>    create mode 100644 tools/perf/pmu-
>> events/arch/arm64/nvidia/t410/tlb.json
>>>>
>>>> diff --git a/tools/perf/pmu-events/arch/arm64/common-and-
>> microarch.json b/tools/perf/pmu-events/arch/arm64/common-and-
>> microarch.json
>>>> index 468cb085d879..6af15776ff17 100644
>>>> --- a/tools/perf/pmu-events/arch/arm64/common-and-microarch.json
>>>> +++ b/tools/perf/pmu-events/arch/arm64/common-and-microarch.json
>>>> @@ -179,6 +179,11 @@
>>>>            "EventName": "BUS_CYCLES",
>>>>            "BriefDescription": "Bus cycle"
>>>>        },
>>>> +    {
>>>> +        "EventCode": "0x001E",
>>>> +        "EventName": "CHAIN",
>>>> +        "BriefDescription": "Chain a pair of event counters."
>>>> +    },
>>>
>>> Cool stuff :-)
>>>
>>> For wider counters AMD does something similar, but should this be an
>>> implementation detail rather than an exposed event? How does it
>>> operate as an event?
>>>
>>
>> CHAIN should be excluded from the json, it's used internally by the
>> driver when you want 64 bit counters. Userspace can't use it because it
>> can't control where counters are placed to make sure they're adjacent.
>>
> 
> Sure, will address this in V2.
> 
>>> [snip]
>>>> diff --git a/tools/perf/pmu-events/arch/arm64/mapfile.csv
>> b/tools/perf/pmu-events/arch/arm64/mapfile.csv
>>>> index bb3fa8a33496..7f0eaa702048 100644
>>>> --- a/tools/perf/pmu-events/arch/arm64/mapfile.csv
>>>> +++ b/tools/perf/pmu-events/arch/arm64/mapfile.csv
>>>> @@ -46,3 +46,4 @@
>>>>    0x00000000500f0000,v1,ampere/emag,core
>>>>    0x00000000c00fac30,v1,ampere/ampereone,core
>>>>    0x00000000c00fac40,v1,ampere/ampereonex,core
>>>> +0x000000004e0f0100,v1,nvidia/t410,core
>>>> diff --git a/tools/perf/pmu-events/arch/arm64/nvidia/t410/branch.json
>> b/tools/perf/pmu-events/arch/arm64/nvidia/t410/branch.json
>>>> new file mode 100644
>>>> index 000000000000..532bc59dc573
>>>> --- /dev/null
>>>> +++ b/tools/perf/pmu-events/arch/arm64/nvidia/t410/branch.json
>>>> @@ -0,0 +1,45 @@
>>>> +[
>>>> +    {
>>>> +        "ArchStdEvent": "BR_MIS_PRED",
>>>> +        "PublicDescription": "The Event counts Branches which are
>> speculatively executed and mis-predicted."
>>>
>>> nit: The capitalization on Event and Branches, as well as other words,
>>> is a little unusual.
>>>
>>
>> If there's nothing specific to this CPU then the public description
>> could be left out entirely. The common strings already say the same
>> thing as this:
>>
>>       {
>>           "PublicDescription": "Mispredicted or not predicted branch
>>                                 speculatively executed",
>>           "EventCode": "0x10",
>>           "EventName": "BR_MIS_PRED",
>>           "BriefDescription": "Mispredicted or not predicted branch
>>                                speculatively executed"
>>       },
>>
>>
> 
> I will check on this and other events.
> 
>>> [snip]
>>>> diff --git a/tools/perf/pmu-events/arch/arm64/nvidia/t410/metrics.json
>> b/tools/perf/pmu-events/arch/arm64/nvidia/t410/metrics.json
>>>> new file mode 100644
>>>> index 000000000000..18c2fd58ee9e
>>>> --- /dev/null
>>>> +++ b/tools/perf/pmu-events/arch/arm64/nvidia/t410/metrics.json
>>>> @@ -0,0 +1,722 @@
>>>> +[
>>>> +    {
>>>> +        "MetricName": "backend_bound",
>>>> +        "MetricExpr": "100 * (STALL_SLOT_BACKEND / CPU_SLOT)",
>>>> +        "BriefDescription": "This metric is the percentage of total slots that
>> were stalled due to resource constraints in the backend of the processor.",
>>>> +        "ScaleUnit": "1percent of slots",
>>>> +        "MetricGroup": "TopdownL1"
>>>
>>> Note, on x86 we place TopdownL1 in the Default metric group so it is
>>> shown by `perf stat` when it isn't given events or metrics to compute.
>>> Perhaps you want to do something similar?
>>> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-
>> next.git/tree/tools/perf/pmu-events/arch/x86/icelake/icl-
>> metrics.json?h=tmp.perf-tools-next#n116
>>>
>>> Thanks,
>>> Ian
>>
>> That's currently true for the Arm topdown metrics in sbsa.json, but we
>> probably don't actually want this because there aren't enough counters
>> for the perf stat defaults plus topdown and it results in multiplexing
>> and bad results.
>>
>> I was planning to change it but we also don't currently expose the
>> number of counters available either...
>>
> 
> Agree with James, we may not be able to cover all the events in
> Default + TopdownL1 group.
> 
> Ian, James, in general, is it fine to put some metrics in the same group because
> they are correlated, even though it may cause multiplexing?
> For example, the "Miss_Ratio" group in this patch provides miss ratios for L1, L2, TLB, etc.
> It's convenient to have this and get the miss ratio from all cache levels.
> This group needs many events and might not be accurate due to
> multiplexing, but the user can get a (rough) broad view in the beginning.
> 
> Thanks,
> Besar

Yes, I think it makes sense to put them all in the same group if they
cover the same feature. My point about possibly not wanting to show
TopdownL1 by default is a separate issue from whether they should be in
the group or not. I just thought it was a good time to mention it.
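
As an illustration of why a large group gives only a rough view under
multiplexing: when more events are requested than there are counters,
perf time-slices them and extrapolates each raw count by
time_enabled / time_running. The sketch below shows that arithmetic
only. The function name and the numbers are made up for illustration
and are not taken from the perf source or from this patch.

```python
def scale_count(raw, time_enabled, time_running):
    """Extrapolate a multiplexed count to the whole enabled window.

    raw          -- count observed while the event was on a counter
    time_enabled -- time the event was enabled (requested)
    time_running -- time the event was actually scheduled on a counter
    """
    if time_running == 0:
        # The event never got a counter, so there is nothing to scale.
        return None
    return raw * time_enabled / time_running


# Hypothetical numbers: the event was scheduled for a quarter of the
# time it was enabled, so the reported value is an estimate that
# assumes the unobserved three quarters behaved the same way.
estimate = scale_count(raw=1_000, time_enabled=4_000_000, time_running=1_000_000)
print(estimate)
```

The estimate is only as good as that assumption, which is why ratios
built from many multiplexed events can be off for bursty workloads.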




