[PATCH v4 4/9] libperf: Add libperf_evsel__mmap()

Mon Oct 19 16:15:41 EDT 2020

On Fri, Oct 16, 2020 at 04:39:15PM -0500, Rob Herring wrote:
> On Wed, Oct 14, 2020 at 6:05 AM Jiri Olsa <jolsa at redhat.com> wrote:
> >
> > On Thu, Oct 01, 2020 at 09:01:11AM -0500, Rob Herring wrote:
> >
> > SNIP
> >
> > >
> > > +void *perf_evsel__mmap(struct perf_evsel *evsel, int pages)
> > > +{
> > > +     int ret;
> > > +     struct perf_mmap *map;
> > > +     struct perf_mmap_param mp = {
> > > +             .prot = PROT_READ | PROT_WRITE,
> > > +     };
> > > +
> > > +     if (FD(evsel, 0, 0) < 0)
> > > +             return NULL;
> > > +
> > > +     mp.mask = (pages * page_size) - 1;
> > > +
> > > +     map = zalloc(sizeof(*map));
> > > +     if (!map)
> > > +             return NULL;
> > > +
> > > +     perf_mmap__init(map, NULL, false, NULL);
> > > +
> > > +     ret = perf_mmap__mmap(map, &mp, FD(evsel, 0, 0), 0);
> >
> > hum, so you map the event for FD(0,0), but later in perf_evsel__read
> > you allow reading any cpu/thread combination, which ends up reading
> > data from the FD(0,0) map:
> >
> >         int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
> >                              struct perf_counts_values *count)
> >         {
> >                 size_t size = perf_evsel__read_size(evsel);
> >
> >                 memset(count, 0, sizeof(*count));
> >
> >                 if (FD(evsel, cpu, thread) < 0)
> >                         return -EINVAL;
> >
> >                 if (evsel->mmap && !perf_mmap__read_self(evsel->mmap, count))
> >                         return 0;
> >
> >
> > I think we should either check cpu == 0, thread == 0, or make it
> > general and store perf_evsel::mmap in xyarray as we do for fds
> 
> The mmapped read will actually fail and then we fall back here. My main
> concern though is adding more overhead on a feature that's meant to be
> low overhead (granted, it's not much). Maybe we could add checks on
> the mmap that we've opened the event with pid == 0 and cpu == -1 (so
> only 1 FD)?

but then you limit this to just a single fd.. having mmap as an xyarray
would not be that bad, and perf_evsel__mmap would call perf_mmap__mmap
for each defined cpu/thread.. so how fast this is depends on the user,
i.e. on how many maps need to be created/mmapped
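
a rough sketch of the per-cpu/thread idea, using toy stand-ins only:
toy_evsel, toy_map, toy_evsel__mmap and toy_evsel__map are made-up names
for illustration, not libperf API, and a flat array stands in for xyarray.
The real code would call perf_mmap__mmap where the comment says so:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_mmap. */
struct toy_map {
	int fd;
	int mapped;
};

/* Hypothetical stand-in for struct perf_evsel. */
struct toy_evsel {
	int ncpus;
	int nthreads;
	int *fds;		/* ncpus * nthreads entries, -1 = not opened */
	struct toy_map *maps;	/* same layout, NULL until mmapped */
};

/* Flat index for a (cpu, thread) pair, same idea as xyarray. */
static size_t toy_idx(const struct toy_evsel *e, int cpu, int thread)
{
	assert(cpu >= 0 && cpu < e->ncpus);
	assert(thread >= 0 && thread < e->nthreads);
	return (size_t)cpu * (size_t)e->nthreads + (size_t)thread;
}

/* Create one map per opened fd instead of only for FD(0,0). */
int toy_evsel__mmap(struct toy_evsel *e)
{
	size_t n = (size_t)e->ncpus * (size_t)e->nthreads;
	int cpu, thread;

	e->maps = calloc(n, sizeof(*e->maps));
	if (!e->maps)
		return -1;

	for (cpu = 0; cpu < e->ncpus; cpu++) {
		for (thread = 0; thread < e->nthreads; thread++) {
			size_t i = toy_idx(e, cpu, thread);

			if (e->fds[i] < 0)
				continue;
			/* real code would call perf_mmap__mmap() here */
			e->maps[i].fd = e->fds[i];
			e->maps[i].mapped = 1;
		}
	}
	return 0;
}

/* Look up the map for exactly this cpu/thread, or NULL if none. */
struct toy_map *toy_evsel__map(struct toy_evsel *e, int cpu, int thread)
{
	struct toy_map *m;

	if (!e->maps)
		return NULL;
	m = &e->maps[toy_idx(e, cpu, thread)];
	return m->mapped ? m : NULL;
}
```

with that layout the read path would pick the map for the cpu/thread it
was asked about and fall back to read() when the lookup returns NULL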

jirka