[PATCH v4 4/9] libperf: Add libperf_evsel__mmap()

Rob Herring robh at kernel.org
Tue Oct 20 10:38:13 EDT 2020


On Mon, Oct 19, 2020 at 3:15 PM Jiri Olsa <jolsa at redhat.com> wrote:
>
> On Fri, Oct 16, 2020 at 04:39:15PM -0500, Rob Herring wrote:
> > On Wed, Oct 14, 2020 at 6:05 AM Jiri Olsa <jolsa at redhat.com> wrote:
> > >
> > > On Thu, Oct 01, 2020 at 09:01:11AM -0500, Rob Herring wrote:
> > >
> > > SNIP
> > >
> > > >
> > > > +void *perf_evsel__mmap(struct perf_evsel *evsel, int pages)
> > > > +{
> > > > +     int ret;
> > > > +     struct perf_mmap *map;
> > > > +     struct perf_mmap_param mp = {
> > > > +             .prot = PROT_READ | PROT_WRITE,
> > > > +     };
> > > > +
> > > > +     if (FD(evsel, 0, 0) < 0)
> > > > +             return NULL;
> > > > +
> > > > +     mp.mask = (pages * page_size) - 1;
> > > > +
> > > > +     map = zalloc(sizeof(*map));
> > > > +     if (!map)
> > > > +             return NULL;
> > > > +
> > > > +     perf_mmap__init(map, NULL, false, NULL);
> > > > +
> > > > +     ret = perf_mmap__mmap(map, &mp, FD(evsel, 0, 0), 0);
> > >
> > > hum, so you map event for FD(0,0) but later in perf_evsel__read
> > > you allow to read any cpu/thread combination ending up reading
> > > data from FD(0,0) map:
> > >
> > >         int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> > >                              struct perf_counts_values *count)
> > >         {
> > >                 size_t size = perf_evsel__read_size(evsel);
> > >
> > >                 memset(count, 0, sizeof(*count));
> > >
> > >                 if (FD(evsel, cpu, thread) < 0)
> > >                         return -EINVAL;
> > >
> > >                 if (evsel->mmap && !perf_mmap__read_self(evsel->mmap, count))
> > >                         return 0;
> > >
> > >
> > > I think we should either check cpu == 0, thread == 0, or make it
> > > general and store perf_evsel::mmap in xyarray as we do for fds
> >
> > The mmapped read will actually fail and then we fall back here. My main
> > concern though is adding more overhead on a feature that's meant to be
> > low overhead (granted, it's not much). Maybe we could add checks on
> > the mmap that we've opened the event with pid == 0 and cpu == -1 (so
> > only 1 FD)?
>
> but then you limit this just for single fd.. having mmap as xyarray
> would not be that bad and perf_evsel__mmap will call perf_mmap__mmap
> for each defined cpu/thread .. so it depends on user how fast this
> will be - how many maps need to be created/mmapped

Given that userspace access fails for anything other than an event
opened on the calling thread for all cpus (pid == 0, cpu == -1), how
would more than 1 mmap be useful here?

If we did want multiple mmaps, wouldn't we just use the evlist API in
that case? It already does all that.
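
As a rough illustration of the single-FD check discussed above, something
along these lines would do. This is a standalone sketch, not code from the
patch: struct fake_evsel and evsel_single_self_fd() are hypothetical names
standing in for the real evsel and its cpu/thread maps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the bits of an evsel that matter here.
 * The real struct perf_evsel keeps its fds in an xyarray indexed by
 * cpu and thread; only the dimensions and first entries are modelled.
 */
struct fake_evsel {
	int nr_cpus;    /* number of cpu entries the event was opened on */
	int nr_threads; /* number of thread entries the event was opened on */
	int cpu0;       /* first cpu entry, -1 meaning "any cpu" */
	int pid0;       /* first thread entry, 0 meaning "calling thread" */
};

/*
 * Return true only when the event was opened with pid == 0 and
 * cpu == -1, i.e. there is exactly one fd and it belongs to the
 * calling thread. That is the only case where a userspace
 * (mmapped) counter read is expected to succeed.
 */
static bool evsel_single_self_fd(const struct fake_evsel *evsel)
{
	assert(evsel != NULL);

	return evsel->nr_cpus == 1 && evsel->nr_threads == 1 &&
	       evsel->cpu0 == -1 && evsel->pid0 == 0;
}
```

An mmap helper could bail out early when such a check fails, so that a
read for any other cpu/thread never touches the FD(0,0) map.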

Rob


