[PATCH v8 2/2] drivers/perf: hisi: Add driver for HiSilicon PCIe PMU

Will Deacon will at kernel.org
Fri Aug 13 07:40:27 PDT 2021


On Wed, Aug 04, 2021 at 03:29:54PM +0800, liuqi (BA) wrote:
> 
> Hi Will,
> > Hmm, I was hoping that you would expose all the events as proper perf_events
> > and get rid of the subevents entirely.
> > 
> > Then userspace could do things like:
> > 
> >    // Count number of RX memory reads
> >    $ perf stat -e hisi_pcie0_0/rx_memory_read/
> > 
> >    // Count delay cycles
> >    $ perf stat -e hisi_pcie0_0/latency/
> > 
> >    // Count both of the above (events must be in the same group)
> >    $ perf stat -g -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
> > 
> > Note that in all three of these cases the hardware will be programmed in
> > the same way and both HISI_PCIE_CNT and HISI_PCIE_EXT_CNT are allocated!
> > 
> > So for example, doing this (i.e. without the '-g'):
> > 
> >    $ perf stat -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
> > 
> > would fail because the first event would allocate both of the counters.
> 
> I'm confused about this situation once the subevents are removed:
> 
> $ perf stat -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
> 
> In this case, the driver checks the relationship between "latency" and
> "rx_memory_read" in the pmu->add() function and returns -EINVAL, but this
> seems to lead to time-division multiplexing.
> 
> 	if (event->pmu->add(event, PERF_EF_START)) {
> 		perf_event_set_state(event, PERF_EVENT_STATE_INACTIVE);
> 		event->oncpu = -1;
> 		ret = -EAGAIN;
> 		goto out;
> 	}
> 	...
> out:
> 	perf_pmu_enable(event->pmu);
> 
> This result doesn't meet our expectation. Am I missing something here?

This is how perf works. If you don't want multiplexing, put the events in a
group. What's the problem with that?

Will


