[PATCH 00/22] Refactor perf cpumap

Ian Rogers irogers at google.com
Mon Dec 13 08:10:03 PST 2021


On Mon, Dec 13, 2021 at 3:39 AM James Clark <james.clark at arm.com> wrote:
>
> On 08/12/2021 02:45, Ian Rogers wrote:
> > Perf cpu map has various functions where a cpumap and index are passed
> > in order to load the cpu. A problem with this is that the wrong index
> > may be passed for the cpumap, causing problems like aggregation on the
> > wrong CPU:
> > https://lore.kernel.org/lkml/20211204023409.969668-1-irogers@google.com/
> >
> > This patch set refactors the cpu map API, greatly reducing it and
> > explicitly passing the cpu (rather than the pair) to functions that
> > need it. Comments are added at the same time.
> >
> > Ian Rogers (22):
> >   libperf: Add comments to perf_cpu_map.
> >   perf stat: Add aggr creators that are passed a cpu.
> >   perf stat: Switch aggregation to use for_each loop
> >   perf stat: Switch to cpu version of cpu_map__get
> >   perf cpumap: Switch cpu_map__build_map to cpu function
> >   perf cpumap: Remove map+index get_socket
> >   perf cpumap: Remove map+index get_die
> >   perf cpumap: Remove map+index get_core
> >   perf cpumap: Remove map+index get_node
> >   perf cpumap: Add comments to aggr_cpu_id
> >   perf cpumap: Remove unused cpu_map__socket
> >   perf cpumap: Simplify equal function name.
> >   perf cpumap: Rename empty functions.
> >   perf cpumap: Document cpu__get_node and remove redundant function
> >   perf cpumap: Remove map from function names that don't use a map.
> >   perf cpumap: Remove cpu_map__cpu, use libperf function.
> >   perf cpumap: Refactor cpu_map__build_map
> >   perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions
> >   perf cpumap: Move 'has' function to libperf
> >   perf cpumap: Add some comments to cpu_aggr_map
> >   perf cpumap: Trim the cpu_aggr_map
> >   perf stat: Fix memory leak in check_per_pkg
> >
> >  tools/lib/perf/cpumap.c                  |   7 +-
> >  tools/lib/perf/include/internal/cpumap.h |   9 +-
> >  tools/lib/perf/include/perf/cpumap.h     |   1 +
> >  tools/perf/arch/arm/util/cs-etm.c        |  16 +-
> >  tools/perf/builtin-ftrace.c              |   2 +-
> >  tools/perf/builtin-sched.c               |   6 +-
> >  tools/perf/builtin-stat.c                | 273 ++++++++++++-----------
> >  tools/perf/tests/topology.c              |  10 +-
> >  tools/perf/util/cpumap.c                 | 182 ++++++---------
> >  tools/perf/util/cpumap.h                 | 102 ++++++---
> >  tools/perf/util/cputopo.c                |   2 +-
> >  tools/perf/util/env.c                    |   6 +-
> >  tools/perf/util/stat-display.c           |  69 +++---
> >  tools/perf/util/stat.c                   |   9 +-
> >  tools/perf/util/stat.h                   |   3 +-
> >  15 files changed, 361 insertions(+), 336 deletions(-)
> >
>
> For the whole set:
>
> Reviewed-by: James Clark <james.clark at arm.com>
>
> I didn't see any obvious issues with mixing up aggregation modes or CPU/idx types. Also
> gave perf stat a test in the different modes and didn't see an issue.
>
> But I'm wondering if it's possible to go further and add a struct around the CPU int so that the
> compiler checks for correctness instead. It still seems quite easy to mix up index and
> CPU; for example, these functions are subtly different, but both take an int:
>
>   LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
>   LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);
>
> Something like this would make it impossible to make a mistake:
>
>   struct cpu { int cpu; };
>
> I mean it's more of a coincidence that CPUs can be identified by an integer, but they are more
> of an object than an integer, so it could make sense to wrap it. But maybe it could be quite
> cumbersome to use and be overkill.

Thanks James! I am working on a v2 patch set and will have a go at
adding this to the end.
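
For reference, a rough sketch of the kind of wrapper being suggested
might look like the following. The names struct cpu_id, struct
demo_cpu_map and the demo_* functions are made up for illustration and
are not the libperf API; the point is only that an index stays a plain
int while a CPU number becomes its own type, so passing one where the
other is expected no longer compiles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper: a CPU number is a distinct type from a map index. */
struct cpu_id {
	int cpu;
};

/* Hypothetical, simplified stand-in for a cpu map (not struct perf_cpu_map). */
struct demo_cpu_map {
	int nr;               /* number of valid entries in map[] */
	struct cpu_id map[4]; /* CPU numbers, looked up by index */
};

/* Index -> CPU. Takes a plain int index, returns the wrapped CPU number. */
static struct cpu_id demo_cpu_map__cpu(const struct demo_cpu_map *cpus, int idx)
{
	struct cpu_id invalid = { .cpu = -1 };

	if (cpus == NULL || idx < 0 || idx >= cpus->nr)
		return invalid;
	return cpus->map[idx];
}

/* CPU -> present? Takes the wrapped type, so an index cannot be passed by accident. */
static bool demo_cpu_map__has(const struct demo_cpu_map *cpus, struct cpu_id cpu)
{
	if (cpus == NULL)
		return false;
	for (int i = 0; i < cpus->nr; i++) {
		if (cpus->map[i].cpu == cpu.cpu)
			return true;
	}
	return false;
}
```

With this shape, a call such as demo_cpu_map__has(&cpus, idx) with an
int idx is rejected by the compiler, whereas with two int parameters it
would build and silently test the wrong value. The cost is the extra
.cpu at every use, which is the cumbersomeness mentioned above.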

Ian


