[PATCH v2] perf stat: Enable iostat mode for HiSilicon PCIe PMU

Arnaldo Carvalho de Melo acme at kernel.org
Fri Feb 9 05:30:59 PST 2024


On Fri, Feb 09, 2024 at 10:59:04AM +0000, Robin Murphy wrote:
> On 2024-02-08 11:58 pm, Namhyung Kim wrote:
> > On Wed, Feb 7, 2024 at 7:29 PM Yicong Yang <yangyicong at huawei.com> wrote:
> > > From: Yicong Yang <yangyicong at hisilicon.com>
> > > Some HiSilicon platforms provide PCIe PMU devices for monitoring the
> > > throughput and latency of PCIe traffic. With the support of PCIe PMU
> > > we can enable the perf iostat mode.

> > Hmm.. so it only works for HiSilicon.  What if users run it on a different
> > platform?
 
> Same thing as if they run it on an AMD or older Intel platform ;)
 
> >  I think ARM should care about this.
 
> Arm don't make PCIe root ports, and there is no PMU standardisation between
> all the myriad different implementers and vendors of PCIe IP, so the best
> perf can reasonably do is simply support the particular PMUs that people
> want perf to support.

Yeah, that will make perf more useful to more people, which is good.

And it is reusing something we did in the past for a similar mode on
Intel, from what I remember and by looking at:

---------------------------
commit f9ed693e8bc0e7de9eb766a3c7178590e8bb6cd5
Author: Alexander Antonov <alexander.antonov at linux.intel.com>
Date:   Mon Apr 19 12:41:46 2021 +0300

    perf stat: Enable iostat mode for x86 platforms

    This functionality is based on recently introduced sysfs attributes for
    Intel® Xeon® Scalable processor family (code name Skylake-SP):

    Commit bb42b3d39781d7fc ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")

    Mode is intended to provide four I/O performance metrics in MB per each
    PCIe root port:

     - Inbound Read: I/O devices below root port read from the host memory
     - Inbound Write: I/O devices below root port write to the host memory
     - Outbound Read: CPU reads from I/O devices below root port
     - Outbound Write: CPU writes to I/O devices below root port

    Each metric requires only one uncore event which increments at every 4B
    transfer in corresponding direction. The formulas to compute metrics
    are generic:
        #EventCount * 4B / (1024 * 1024)
---------------------------

And tools/perf/perf-iostat.sh.

Right?
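
For reference, the generic formula quoted from that commit message can be
sketched as below. This is only an illustration of the arithmetic, not
code from perf: the function name, the constant name, and the sample
count are all made up here, and the only thing taken from the commit is
"#EventCount * 4B / (1024 * 1024)".

```python
# Illustrative sketch only; not taken from tools/perf.
# Per the quoted commit message, each event increments once per 4-byte transfer.
BYTES_PER_EVENT = 4


def event_count_to_mb(event_count: int) -> float:
    """Convert a raw event count to MB: count * 4B / (1024 * 1024)."""
    return event_count * BYTES_PER_EVENT / (1024 * 1024)


if __name__ == "__main__":
    sample_count = 262144  # hypothetical event count, chosen for illustration
    print(event_count_to_mb(sample_count))
```

The same conversion would apply to each of the four directions listed in
the commit message, since each one is described as counting 4B transfers.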

- Arnaldo


