[PATCH v4 4/9] libperf: Add libperf_evsel__mmap()

Jiri Olsa jolsa at redhat.com
Wed Nov 11 07:00:56 EST 2020


On Fri, Nov 06, 2020 at 03:56:11PM -0600, Rob Herring wrote:
> On Thu, Nov 5, 2020 at 4:41 PM Jiri Olsa <jolsa at redhat.com> wrote:
> >
> > On Thu, Nov 05, 2020 at 10:19:24AM -0600, Rob Herring wrote:
> >
> > SNIP
> >
> > > > > >
> > > > > > that maps page for each event, then perf_evsel__read
> > > > > > could go through the fast code, no?
> > > > >
> > > > > No, because we're not self-monitoring (pid == 0 and cpu == -1). With
> > > > > the following change:
> > > > >
> > > > > diff --git a/tools/lib/perf/tests/test-evsel.c
> > > > > b/tools/lib/perf/tests/test-evsel.c
> > > > > index eeca8203d73d..1fca9c121f7c 100644
> > > > > --- a/tools/lib/perf/tests/test-evsel.c
> > > > > +++ b/tools/lib/perf/tests/test-evsel.c
> > > > > @@ -17,6 +17,7 @@ static int test_stat_cpu(void)
> > > > >  {
> > > > >         struct perf_cpu_map *cpus;
> > > > >         struct perf_evsel *evsel;
> > > > > +       struct perf_event_mmap_page *pc;
> > > > >         struct perf_event_attr attr = {
> > > > >                 .type   = PERF_TYPE_SOFTWARE,
> > > > >                 .config = PERF_COUNT_SW_CPU_CLOCK,
> > > > > @@ -32,6 +33,15 @@ static int test_stat_cpu(void)
> > > > >         err = perf_evsel__open(evsel, cpus, NULL);
> > > > >         __T("failed to open evsel", err == 0);
> > > > >
> > > > > +       pc = perf_evsel__mmap(evsel, 0);
> > > > > +       __T("failed to mmap evsel", pc);
> > > > > +
> > > > > +#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
> > > > > +       __T("userspace counter access not supported", pc->cap_user_rdpmc);
> > > > > +       __T("userspace counter access not enabled", pc->index);
> > > > > +       __T("userspace counter width not set", pc->pmc_width >= 32);
> > > > > +#endif
> > > >
> > > > I'll need to check, I'm surprised this would depend on the way
> > > > you open the event
> > >
> > > Any more thoughts on this?
> >
> > sorry, I got stuck with other stuff.. I tried your change
> > and pc->cap_user_rdpmc is 0 because the test creates
> > a software event, which does not support that
> 
> Sigh, yes, of course.
> 
> > when I change that to:
> >
> >         .type   = PERF_TYPE_HARDWARE,
> >         .config = PERF_COUNT_HW_CPU_CYCLES,
> >
> > I don't see any of those warnings you added
> 
> So I've now implemented the per fd mmap. It seems to run and get some
> data, but for the above case the counts don't look right.
> 
> cpu0: count = 0x10883, ena = 0xbf42, run = 0xbf42
> cpu1: count = 0x1bc65, ena = 0xa278, run = 0xa278
> cpu2: count = 0x1fab2, ena = 0x91ea, run = 0x91ea
> cpu3: count = 0x23d61, ena = 0x81ac, run = 0x81ac
> cpu4: count = 0x2936a, ena = 0x7149, run = 0x7149
> cpu5: count = 0x2cd4e, ena = 0x634f, run = 0x634f
> cpu6: count = 0x3139f, ena = 0x53e7, run = 0x53e7
> cpu7: count = 0x35350, ena = 0x4690, run = 0x4690
> 
> For comparison, this is what I get using the slow path read():
> cpu0: count = 0x1c40, ena = 0x188b5, run = 0x188b5
> cpu1: count = 0x18e0, ena = 0x1b8f4, run = 0x1b8f4
> cpu2: count = 0x745e, ena = 0x1ab9e, run = 0x1ab9e
> cpu3: count = 0x2416, ena = 0x1a280, run = 0x1a280
> cpu4: count = 0x19c7, ena = 0x19b00, run = 0x19b00
> cpu5: count = 0x1737, ena = 0x19262, run = 0x19262
> cpu6: count = 0x11d0e, ena = 0x18944, run = 0x18944
> cpu7: count = 0x20dbe, ena = 0x181f4, run = 0x181f4

hum, could you please send/push changes with that test?
I can try it and check

jirka

> 
> The difference is we get a sequentially increasing count rather than 1
> random CPU (the one running the test) with a much higher count. That
> suggests to me we're just reading the count for the calling process, not
> each CPU.
> 
> For this to work correctly, cap_user_rdpmc would have to be set only
> for the CPU's mmap that matches the calling process's CPU. I'm not
> sure whether that can be done. Even if it can, is it really worth
> doing so? You're accelerating reading an event on 1 out of N CPUs. And
> what do we do on every kernel up until now, where this won't work? Another
> cap bit?
> 
> Rob
> 
> P.S. I did find one bug with all this. The shifts by pmc_width in the
> read seq need to be a signed count. This test happens to have raw
> counter values starting at 2^47.
> 
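The signed-shift point in the P.S. can be illustrated with a minimal sketch. The helper names below (pmc_sign_extend, pmc_zero_extend, pmc_add) are hypothetical and are not part of libperf or of the patch; the sketch only assumes that the raw counter value is pmc_width bits wide (1 to 64) and that it is added to an offset such as pc->offset. It also relies on the compiler doing an arithmetic right shift on signed values, as gcc and clang do.

```c
#include <stdint.h>

/*
 * Sign-extend a raw counter value that is `width` bits wide
 * (assumed 1 <= width <= 64). The left shift is done on the
 * unsigned value; the right shift is done on a signed value so
 * that the top counter bit is replicated (arithmetic shift on
 * gcc/clang; implementation-defined in ISO C).
 */
int64_t pmc_sign_extend(uint64_t raw, unsigned int width)
{
	int64_t v = (int64_t)(raw << (64 - width));

	return v >> (64 - width);
}

/*
 * The same shifts on an unsigned value: the top counter bit is
 * NOT replicated, so a raw value of 2^47 stays a large positive
 * number.
 */
uint64_t pmc_zero_extend(uint64_t raw, unsigned int width)
{
	uint64_t v = raw << (64 - width);

	return v >> (64 - width);
}

/* Add the sign-extended raw counter to an offset (hypothetical helper). */
uint64_t pmc_add(uint64_t offset, uint64_t raw, unsigned int width)
{
	return offset + (uint64_t)pmc_sign_extend(raw, width);
}
```

With a 48-bit counter whose raw value starts at 2^47, the unsigned variant leaves 2^47 in place, while the signed variant yields -2^47, so adding it to an offset of 2^47 gives a small count rather than one near 2^48.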