[PATCH v2] perf stat: Enable iostat mode for HiSilicon PCIe PMU
Arnaldo Carvalho de Melo
acme at kernel.org
Fri Feb 9 05:30:59 PST 2024
On Fri, Feb 09, 2024 at 10:59:04AM +0000, Robin Murphy wrote:
> On 2024-02-08 11:58 pm, Namhyung Kim wrote:
> > On Wed, Feb 7, 2024 at 7:29 PM Yicong Yang <yangyicong at huawei.com> wrote:
> > > From: Yicong Yang <yangyicong at hisilicon.com>
> > > Some HiSilicon platforms provide PCIe PMU devices for monitoring the
> > > throughput and latency of PCIe traffic. With the support of PCIe PMU
> > > we can enable the perf iostat mode.
> > Hmm.. so it only works for HiSilicon. What if users run it on a different
> > platform?
> Same thing as if they run it on an AMD or older Intel platform ;)
> > I think ARM should care about this.
> Arm don't make PCIe root ports, and there is no PMU standardisation between
> all the myriad different implementers and vendors of PCIe IP, so the best
> perf can reasonably do is simply support the particular PMUs that people
> want perf to support.
Yeah, that will make perf more useful to more people, which is good.
And it is reusing something we did in the past for a similar mode on
Intel, from what I remember and by looking at:
---------------------------
commit f9ed693e8bc0e7de9eb766a3c7178590e8bb6cd5
Author: Alexander Antonov <alexander.antonov at linux.intel.com>
Date: Mon Apr 19 12:41:46 2021 +0300
perf stat: Enable iostat mode for x86 platforms
This functionality is based on recently introduced sysfs attributes for
Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d39781d7fc ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
The mode is intended to provide four I/O performance metrics, in MB, for
each PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
Each metric requires only one uncore event, which increments at every 4B
transfer in the corresponding direction. The formulas to compute metrics
are generic:
#EventCount * 4B / (1024 * 1024)
---------------------------
And tools/perf/perf-iostat.sh.
Right?
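
For reference, the generic formula quoted above can be sketched in a few
lines. This is only an illustration of the arithmetic, not what perf
itself does: the function name, the constants and the sample counter
values are all made up here, and the only thing taken from the commit
message is that one event corresponds to one 4B transfer and that the
result is reported in MB (1024 * 1024 bytes).

```python
# Illustrative sketch of the formula from the commit message:
#   #EventCount * 4B / (1024 * 1024)
# All names and sample values below are hypothetical.

BYTES_PER_EVENT = 4          # one event per 4B transfer (from the commit message)
BYTES_PER_MB = 1024 * 1024   # MB as used in the formula


def event_count_to_mb(event_count: int) -> float:
    """Convert a raw event count into MB transferred."""
    return event_count * BYTES_PER_EVENT / BYTES_PER_MB


if __name__ == "__main__":
    # Made-up counter readings for one root port, one per metric.
    sample_counts = {
        "Inbound Read": 262144,
        "Inbound Write": 524288,
        "Outbound Read": 0,
        "Outbound Write": 131072,
    }
    for metric, count in sample_counts.items():
        print(f"{metric}: {event_count_to_mb(count):.2f} MB")
```

Since the conversion does not depend on which PMU produced the count, a
new platform would only need its own event that counts 4B transfers, or
its own scaling factor if its event has a different granularity.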
- Arnaldo