[PATCH v2 8/8] perf pmu: Move pmu_metrics_table__find and remove ARM override

Ian Rogers irogers at google.com
Sat Nov 9 08:10:22 PST 2024


On Sat, Nov 9, 2024 at 2:54 AM Yicong Yang <yangyicong at huawei.com> wrote:
>
> Hi,
>
> On 2024/11/8 0:20, Ian Rogers wrote:
> > Move pmu_metrics_table__find to the jevents.py generated pmu-events.c
> > and remove indirection override for ARM. The movement removes
> > perf_pmu__find_metrics_table that exists to enable the ARM
> > override. The ARM override isn't necessary as just the CPUID, not PMU,
> > is used in the metric table lookup. On non-ARM the CPU argument is
> > just ignored for the CPUID, for ARM -1 is passed so that the CPUID for
> > the first logical CPU is read.
>
> Since the logic here's already been touching, is it possible to step it further to just
> ignore the CPUID matching when finding the system metrics/events tables? It's may not be
> that reasonable for finding a system metrics/events from the CPUID, since one system PMU may
> exists on different platforms with different CPU types.

The issue is for conciseness reasons we let metrics and metric
thresholds refer to other metrics, for example:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json#n78
```
    {
        "BriefDescription": "This category represents fraction of
slots where no uops are being delivered due to a lack of required
resources for accepting new uops in the Backend",
        "MetricExpr": "topdown\\-be\\-bound / (topdown\\-fe\\-bound +
topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5
* cpu at INT_MISC.RECOVERY_CYCLES\\,cmask\\=1\\,edge@ / tma_info_slots",
        "MetricGroup": "TmaL1;TopdownL1;tma_L1_group",
        "MetricName": "tma_backend_bound",
        "MetricThreshold": "tma_backend_bound > 0.2",
        "MetricgroupNoGroup": "TopdownL1",
        "PublicDescription": "This category represents fraction of
slots where no uops are being delivered due to a lack of required
resources for accepting new uops in the Backend. Backend is the
portion of the processor core where the out-of-order scheduler
dispatches ready uops into their respective execution units; and once
completed these uops get retired according to program order. For
example; stalls due to data-cache misses or stalls due to the divider
unit being overloaded are both categorized under Backend Bound.
Backend Bound is further divided into two main categories: Memory
Bound and Core Bound. Sample with: TOPDOWN.BACKEND_BOUND_SLOTS",
        "ScaleUnit": "100%"
    },
```

The system metrics were added on top of this and we never rethought
the design. For a metric to refer to another metric there needs to be
some kind of place we look up from and for that we use the CPUID
associated table. Perhaps the easiest thing is that if no CPUID table
is matched we have an empty table.

> FYI, there's a similiar problem when trying to count the system metrics but fails [1].
> I've tested with this series but the problem still exists.

Not sure what you are asking me for here.

Thanks,
Ian

> [1] https://lore.kernel.org/linux-perf-users/20241010074430.16685-1-hejunhao3@huawei.com/
>
> Thanks.
>
> >
> > Signed-off-by: Ian Rogers <irogers at google.com>
> > ---
> >  tools/perf/arch/arm64/util/pmu.c         | 20 --------------------
> >  tools/perf/pmu-events/empty-pmu-events.c | 10 ++++------
> >  tools/perf/pmu-events/jevents.py         | 10 ++++------
> >  tools/perf/pmu-events/pmu-events.h       |  2 +-
> >  tools/perf/util/pmu.c                    |  5 -----
> >  tools/perf/util/pmu.h                    |  1 -
> >  6 files changed, 9 insertions(+), 39 deletions(-)
> >
> > diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
> > index a0964b191fcb..895fb0d0610c 100644
> > --- a/tools/perf/arch/arm64/util/pmu.c
> > +++ b/tools/perf/arch/arm64/util/pmu.c
> > @@ -1,29 +1,9 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >
> > -#include <internal/cpumap.h>
> > -#include "../../../util/cpumap.h"
> > -#include "../../../util/header.h"
> >  #include "../../../util/pmu.h"
> >  #include "../../../util/pmus.h"
> >  #include "../../../util/tool_pmu.h"
> >  #include <api/fs/fs.h>
> > -#include <math.h>
> > -
> > -const struct pmu_metrics_table *pmu_metrics_table__find(void)
> > -{
> > -     struct perf_pmu *pmu;
> > -
> > -     /* Metrics aren't currently supported on heterogeneous Arm systems */
> > -     if (perf_pmus__num_core_pmus() > 1)
> > -             return NULL;
> > -
> > -     /* Doesn't matter which one here because they'll all be the same */
> > -     pmu = perf_pmus__find_core_pmu();
> > -     if (pmu)
> > -             return perf_pmu__find_metrics_table(pmu);
> > -
> > -     return NULL;
> > -}
> >
> >  u64 tool_pmu__cpu_slots_per_cycle(void)
> >  {
> > diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c
> > index 17306e316a3c..1c7a2cfa321f 100644
> > --- a/tools/perf/pmu-events/empty-pmu-events.c
> > +++ b/tools/perf/pmu-events/empty-pmu-events.c
> > @@ -587,14 +587,12 @@ const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu *pmu)
> >          return NULL;
> >  }
> >
> > -const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu *pmu)
> > +const struct pmu_metrics_table *pmu_metrics_table__find(void)
> >  {
> > -        const struct pmu_events_map *map = map_for_pmu(pmu);
> > -
> > -        if (!map)
> > -                return NULL;
> > +        struct perf_cpu cpu = {-1};
> > +        const struct pmu_events_map *map = map_for_cpu(cpu);
> >
> > -     return &map->metric_table;
> > +        return map ? &map->metric_table : NULL;
> >  }
> >
> >  const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid)
> > diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
> > index e44b72e56ac3..d781a377757a 100755
> > --- a/tools/perf/pmu-events/jevents.py
> > +++ b/tools/perf/pmu-events/jevents.py
> > @@ -1103,14 +1103,12 @@ const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu *pmu)
> >          return NULL;
> >  }
> >
> > -const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu *pmu)
> > +const struct pmu_metrics_table *pmu_metrics_table__find(void)
> >  {
> > -        const struct pmu_events_map *map = map_for_pmu(pmu);
> > -
> > -        if (!map)
> > -                return NULL;
> > +        struct perf_cpu cpu = {-1};
> > +        const struct pmu_events_map *map = map_for_cpu(cpu);
> >
> > -     return &map->metric_table;
> > +        return map ? &map->metric_table : NULL;
> >  }
> >
> >  const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid)
> > diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h
> > index 5435ad92180c..675562e6f770 100644
> > --- a/tools/perf/pmu-events/pmu-events.h
> > +++ b/tools/perf/pmu-events/pmu-events.h
> > @@ -103,7 +103,7 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table, pm
> >                                    void *data);
> >
> >  const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu *pmu);
> > -const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu *pmu);
> > +const struct pmu_metrics_table *pmu_metrics_table__find(void);
> >  const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid);
> >  const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const char *cpuid);
> >  int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data);
> > diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> > index 514cb865f57b..45838651b361 100644
> > --- a/tools/perf/util/pmu.c
> > +++ b/tools/perf/util/pmu.c
> > @@ -818,11 +818,6 @@ static int is_sysfs_pmu_core(const char *name)
> >       return file_available(path);
> >  }
> >
> > -__weak const struct pmu_metrics_table *pmu_metrics_table__find(void)
> > -{
> > -     return perf_pmu__find_metrics_table(NULL);
> > -}
> > -
> >  /**
> >   * Return the length of the PMU name not including the suffix for uncore PMUs.
> >   *
> > diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> > index fba3fc608b64..7b3e71194e49 100644
> > --- a/tools/perf/util/pmu.h
> > +++ b/tools/perf/util/pmu.h
> > @@ -260,7 +260,6 @@ void perf_pmu__arch_init(struct perf_pmu *pmu);
> >  void pmu_add_cpu_aliases_table(struct perf_pmu *pmu,
> >                              const struct pmu_events_table *table);
> >
> > -const struct pmu_metrics_table *pmu_metrics_table__find(void);
> >  bool pmu_uncore_identifier_match(const char *compat, const char *id);
> >
> >  int perf_pmu__convert_scale(const char *scale, char **end, double *sval);
> >



More information about the linux-riscv mailing list