[PATCH v8 2/2] drivers/perf: hisi: Add driver for HiSilicon PCIe PMU
Will Deacon
will at kernel.org
Fri Aug 13 07:40:27 PDT 2021
On Wed, Aug 04, 2021 at 03:29:54PM +0800, liuqi (BA) wrote:
>
> Hi Will,
> > Hmm, I was hoping that you would expose all the events as proper perf_events
> > and get rid of the subevents entirely.
> >
> > Then userspace could do things like:
> >
> > // Count number of RX memory reads
> > $ perf stat -e hisi_pcie0_0/rx_memory_read/
> >
> > // Count delay cycles
> > $ perf stat -e hisi_pcie0_0/latency/
> >
> > // Count both of the above (events must be in the same group)
> > $ perf stat -g -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
> >
> > Note that in all three of these cases the hardware will be programmed in
> > the same way and both HISI_PCIE_CNT and HISI_PCIE_EXT_CNT are allocated!
> >
> > So for example, doing this (i.e. without the '-g'):
> >
> > $ perf stat -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
> >
> > would fail because the first event would allocate both of the counters.
>
> I'm confused about this situation once the subevents are removed:
>
> $ perf stat -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
>
> In this case, the driver checks the relationship between "latency" and
> "rx_memory_read" in its pmu->add() function and returns -EINVAL, but this
> seems to lead to time-division multiplexing.
>
> 	if (event->pmu->add(event, PERF_EF_START)) {
> 		perf_event_set_state(event, PERF_EVENT_STATE_INACTIVE);
> 		event->oncpu = -1;
> 		ret = -EAGAIN;
> 		goto out;
> 	}
> 	...
> out:
> 	perf_pmu_enable(event->pmu);
>
> This result doesn't meet our expectation. Am I missing something here?

This is how perf works. If you don't want multiplexing, put the events in a
group. What's the problem with that?

Will
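
The difference between the two invocations discussed above can be illustrated
with a toy model. This is only a sketch, not kernel code: toy_pmu, toy_add(),
toy_del(), run_ungrouped() and run_grouped() are hypothetical names, and the
model assumes (as described in the thread) that a single event, or a single
group, claims both counters at once. Ungrouped events are retried in rotated
order on each tick, so they share the hardware over time; a group is scheduled
all-or-nothing, so its members always count together.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy PMU (hypothetical): one counter pair, claimed whole by an event or group. */
struct toy_pmu {
	bool pair_busy;
};

/* Stand-in for a driver's pmu->add(): fails while the pair is taken. */
int toy_add(struct toy_pmu *pmu)
{
	if (pmu->pair_busy)
		return -1;
	pmu->pair_busy = true;
	return 0;
}

/* Stand-in for pmu->del(): releases the pair. */
void toy_del(struct toy_pmu *pmu)
{
	pmu->pair_busy = false;
}

/*
 * Ungrouped: every event is scheduled on its own. The start of the list
 * rotates each tick, so an event whose add() failed gets a later turn.
 */
void run_ungrouped(unsigned ticks, unsigned nr_events, unsigned *time_running)
{
	struct toy_pmu pmu = { false };

	assert(nr_events > 0);
	for (unsigned i = 0; i < nr_events; i++)
		time_running[i] = 0;

	for (unsigned t = 0; t < ticks; t++) {
		for (unsigned i = 0; i < nr_events; i++) {
			unsigned ev = (t + i) % nr_events;

			if (toy_add(&pmu) == 0)
				time_running[ev]++;
			/* else: the event stays inactive for this tick */
		}
		toy_del(&pmu);	/* end of tick: free the pair for the next one */
	}
}

/* Grouped: the group claims the pair once and all members run together. */
void run_grouped(unsigned ticks, unsigned nr_events, unsigned *time_running)
{
	struct toy_pmu pmu = { false };

	for (unsigned i = 0; i < nr_events; i++)
		time_running[i] = 0;

	for (unsigned t = 0; t < ticks; t++) {
		if (toy_add(&pmu) == 0) {
			for (unsigned i = 0; i < nr_events; i++)
				time_running[i]++;
			toy_del(&pmu);
		}
	}
}
```

In this model, two ungrouped events over ten ticks each run for five ticks
(multiplexed), while the same two events in a group both run for all ten.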