[PATCH v1 3/4] perf record: Skip don't fail for events that don't open

Ian Rogers irogers at google.com
Mon Nov 11 09:00:04 PST 2024


On Mon, Nov 11, 2024 at 7:49 AM James Clark <james.clark at linaro.org> wrote:
>
>
>
> On 26/10/2024 1:17 pm, Ian Rogers wrote:
> > Whilst for many tools it is an expected behavior that failure to open
> > a perf event is a failure, ARM decided to name PMU events the same as
> > legacy events and then failed to rename such events on a server uncore
> > SLC PMU. As perf's default behavior when no PMU is specified is to
> > open the event on all PMUs that advertise/"have" the event, this
> > yielded failures when trying to make the priority of legacy and
> > sysfs/json events uniform - something requested by RISC-V and ARM. A
> > legacy event user on ARM hardware may find their event opened on an
> > uncore PMU which for perf record will fail. Arnaldo suggested skipping
> > such events which this patch implements. Rather than have the skipping
> > conditional on running on ARM, the skipping is done on all
> > architectures as such a fundamental behavioral difference could lead
> > to problems with tools built/depending on perf.
> >
> > An example of perf record failing to open events on x86 is:
> > ```
> > $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> > Error:
> > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > "dmesg | grep -i perf" may provide additional information.
> >
> > Error:
> > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > "dmesg | grep -i perf" may provide additional information.
> >
>
> This makes me wonder if this message was overly wordy to begin with.
> This line is fine:
>
>   Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0'
>   which will be removed.
>
> The next bit about the syscall just repeats. The exit code could be
> included on the previous line. And the dmesg bit is general advice that
> could possibly be printed once at the end.

Thanks for the time spent investigating this!

I agree on this. I wonder if we should have short and long messages,
or message+help like we do for parse events. One patch series like
this is improving EBUSY:
https://lore.kernel.org/lkml/20241106003007.2112584-1-ctshao@google.com/
The issue with printing at the end is knowing where/when the end is.
Printing once is easy enough.

> > Error:
> > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > The LLC-prefetch-read event is not supported.
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
> >
> > $ perf report --stats
> > Aggregated stats:
> >                 TOTAL events:      17255
> >                  MMAP events:        284  ( 1.6%)
> >                  COMM events:       1961  (11.4%)
> >                  EXIT events:          1  ( 0.0%)
> >                  FORK events:       1960  (11.4%)
> >                SAMPLE events:         87  ( 0.5%)
> >                 MMAP2 events:      12836  (74.4%)
> >               KSYMBOL events:         83  ( 0.5%)
> >             BPF_EVENT events:         36  ( 0.2%)
> >        FINISHED_ROUND events:          2  ( 0.0%)
> >              ID_INDEX events:          1  ( 0.0%)
> >            THREAD_MAP events:          1  ( 0.0%)
> >               CPU_MAP events:          1  ( 0.0%)
> >             TIME_CONV events:          1  ( 0.0%)
> >         FINISHED_INIT events:          1  ( 0.0%)
> > cycles stats:
> >                SAMPLE events:         87
> > ```
> >
> > Note, if all events fail to open then the data file will contain no
> > samples. This is deliberate as at the point the events are opened
> > there are other events, such as the dummy event for sideband data, and
> > these events will succeed in opening even if the user specified ones
>
> Is a file with only sideband events useful? Is it possible to fail the
> record command if the event doesn't open anywhere?
>
> I noticed this leads to some different behavior and a libperf warning
> when you have paranoid=3:
>
>    $ perf record -e cycles -C 0 -- true
>
>    Error:
>    Failure to open event 'cpu_atom/cycles/u' on PMU 'cpu_atom' which will
>    be removed.
>    ...
>    Consider adjusting /proc/sys/kernel/perf_event_paranoid setting
>    ...
>    libperf: Miscounted nr_mmaps 0 vs 28
>    WARNING: No sample_id_all support, falling back to unordered
>    processing
>    [ perf record: Woken up 1 times to write data ]
>    [ perf record: Captured and wrote 0.021 MB perf.data ]

So paranoid=3 is a Debian extension that Peter Z doesn't agree with
and so isn't enabled in regular kernels. So we're dealing with cycles,
which we'd expect to open or fall-back to a software event like
task-clock. It feels like things could happen better here but that's
not necessarily the fault of this patch.

> > don't. Having a mix of open and broken events leads to a problem of
> > identifying different sources of events.
>  >
>
> In my basic test I saw that the opened event was identified correctly in
> perf report, unless you have an example that you encountered that we
> should fix?

I didn't find any but we also don't tend to test failing to open
events. I could imagine things failing in `perf test` on neoverse
testing given the l3 advertising the cycles event.

> One place I saw an issue was with auxtrace events. If there's an event
> name clash then you're likely to not be able to open the file afterwards
> because the auxtrace code can't handle an event that didn't open. But I
> don't know of any name clashes there (I just faked one for testing), and
> maybe that could be fixed up later in the auxtrace code if there is ever
> a real one.
>
> Other than the above it does seem to work ok.

Cool, can this be taken as a Tested-by?

Thanks,
Ian

> > The issue with legacy events is that on RISC-V they want the driver to
> > not have mappings from legacy to non-legacy config encodings for each
> > vendor/model due to size, complexity and difficulty to update. It was
> > reported that on ARM Apple-M? CPUs the legacy mapping in the driver
> > was broken and the sysfs/json events should always take precedent,
> > however, it isn't clear this is still the case. It is the case that
> > without working around this issue a legacy event like cycles without a
> > PMU can encode differently than when specified with a PMU - the
> > non-PMU version favoring legacy encodings, the PMU one avoiding legacy
> > encodings.
> >
> > The patch removes events and then adjusts the idx value for each
> > evsel. This is done so that the dense xyarrays used for file
> > descriptors, etc. don't contain broken entries. As event opening
> > happens relatively late in the record process, use of the idx value
> > before the open will have become corrupted, so it is expected there
> > are latent bugs hidden behind this change - the change is best
> > effort. As the only vendor that has broken event names is ARM, this
> > will principally effect ARM users. They will also experience warning
> > messages like those above because of the uncore PMU advertising legacy
> > event names.
>
>
>
> >
> > Suggested-by: Arnaldo Carvalho de Melo <acme at kernel.org>
> > Signed-off-by: Ian Rogers <irogers at google.com>
> > ---
> >   tools/perf/builtin-record.c | 22 +++++++++++++++++-----
> >   1 file changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index f83252472921..7e99743f7e42 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -1364,6 +1364,7 @@ static int record__open(struct record *rec)
> >       struct perf_session *session = rec->session;
> >       struct record_opts *opts = &rec->opts;
> >       int rc = 0;
> > +     bool skipped = false;
> >
> >       evlist__for_each_entry(evlist, pos) {
> >   try_again:
> > @@ -1379,15 +1380,26 @@ static int record__open(struct record *rec)
> >                               pos = evlist__reset_weak_group(evlist, pos, true);
> >                               goto try_again;
> >                       }
> > -                     rc = -errno;
> >                       evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
> > -                     ui__error("%s\n", msg);
> > -                     goto out;
> > +                     ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
> > +                               evsel__name(pos), evsel__pmu_name(pos), msg);
> > +                     pos->skippable = true;
> > +                     skipped = true;
> > +             } else {
> > +                     pos->supported = true;
> >               }
> > -
> > -             pos->supported = true;
> >       }
> >
> > +     if (skipped) {
> > +             struct evsel *tmp;
> > +             int idx = 0;
> > +
> > +             evlist__for_each_entry_safe(evlist, tmp, pos) {
> > +                     if (pos->skippable)
> > +                             evlist__remove(evlist, pos);
> > +                     pos->core.idx = idx++;
> > +             }
> > +     }
> >       if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
> >               pr_warning(
> >   "WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
>



More information about the linux-riscv mailing list