[PATCH] perf stat: Enable iostat mode for HiSilicon PCIe PMU

Yicong Yang yangyicong at huawei.com
Tue Feb 6 23:02:23 PST 2024


On 2024/2/6 18:09, Jonathan Cameron wrote:
> On Tue, 23 Jan 2024 15:12:01 +0800
> Yicong Yang <yangyicong at huawei.com> wrote:
> 
>> From: Yicong Yang <yangyicong at hisilicon.com>
> 
> If you end up doing a v2, a few typos and grammar tweaks.
> A few other trivial comments inline on things that might improve readability
> a tiny bit.
> 

Thanks for the comments.

> 
>>
>> Some HiSilicon platforms provide PCIe PMU devices for monitoring the
>> throughout and latency of PCIe traffic. With the support of PCIe PMU
>> we can enable the perf iostat mode.
>>
>> The HiSilicon PCIe PMU can support measuring the throughout of certain
>> TLP types and of certian root port. Totally 6 metrics are provided in
> 
> certain
> 

ok.

>> the unit of MB:
>>
>> - Inbound MWR: The memory write TLPs from the devices downstream the root port
>> - Inbound MRD: The memory read TLPs from the devices downstream the root port
>> - Inbound CPL: The completion TLPs from the devices downstream the root port
>> - Outbound MWR: The memory write TLPs from the CPU to the downstream devices
>> - Outbound MRD: The memory read TLPs from the CPU to the downstream devices
>> - Outbound CPL: The completions TLPs from the CPU to the downstream devices
>>
>> Since the PMU measures the throughout with unit of DWords. So we need to
> throughput in DWords
> 

ok.

>> calculate the throughout in MB like:
>>   Count * 4B / 1024 / 1024
>>
>> Some of the display of the `perf iostat` will be like:
>> [root at localhost tmp]# ./perf iostat list
>> hisi_pcie0_core2<0000:40:00.0>
>> hisi_pcie2_core2<0000:5f:00.0>
>> hisi_pcie0_core1<0000:16:00.0>
>> hisi_pcie0_core1<0000:16:04.0>
>> [root at localhost tmp]# ./perf iostat --timeout 10000
>>
>>  Performance counter stats for 'system wide':
>>
>>     port              Inbound MWR(MB)      Inbound MRD(MB)      Inbound CPL(MB)     Outbound MWR(MB)     Outbound MRD(MB)     Outbound CPL(MB)
>> 0000:40:00.0                    0                    0                    0                    0                    0                    0
>> 0000:5f:00.0                    0                    0                    0                    0                    0                    0
>> 0000:16:00.0             16272.99               366.58                    0                15.09                    0             16156.85
>> 0000:16:04.0                    0                    0                    0                    0                    0                    0
>>
>>       10.008227512 seconds time elapsed
>>
>> [root at localhost tmp]# ./perf iostat 0000:16:00.0 -- fio -name=read
>> -numjobs=30 -filename=/dev/nvme0n1 -rw=rw -iodepth=128 -direct=1 -sync=0
>> -norandommap -group_reporting -runtime=10 -time_based -bs=64k
>>
>>  Performance counter stats for 'system wide':
>>
>>     port              Inbound MWR(MB)      Inbound MRD(MB)      Inbound CPL(MB)     Outbound MWR(MB)     Outbound MRD(MB)     Outbound CPL(MB)
>> 0000:40:00.0                    0                    0                    0                    0                    0                    0
>> 0000:5f:00.0                    0                    0                    0                    0                    0                    0
>> 0000:16:00.0             16314.30               371.22                    0                15.21                    0             16362.20
>> 0000:16:04.0                    0                    0                    0                    0                    0                    0
>>
>>       10.168561767 seconds time elapsed
>>
>>        0.465373000 seconds user
>>        1.952948000 seconds sys
>>
>> More information of the HiSilicon PCIe PMU can be found at
>> Documentation/admin-guide/perf/hisi-pcie-pmu.rst.
>>
>> Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
>> ---
>>  tools/perf/arch/arm64/util/Build         |   1 +
>>  tools/perf/arch/arm64/util/hisi-iostat.c | 433 +++++++++++++++++++++++
>>  2 files changed, 434 insertions(+)
>>  create mode 100644 tools/perf/arch/arm64/util/hisi-iostat.c
>>
>> diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
>> index 78ef7115be3d..4e8dabf98b29 100644
>> --- a/tools/perf/arch/arm64/util/Build
>> +++ b/tools/perf/arch/arm64/util/Build
>> @@ -3,6 +3,7 @@ perf-y += machine.o
>>  perf-y += perf_regs.o
>>  perf-y += tsc.o
>>  perf-y += pmu.o
>> +perf-y += hisi-iostat.o
>>  perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
>>  perf-$(CONFIG_DWARF)     += dwarf-regs.o
>>  perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
>> diff --git a/tools/perf/arch/arm64/util/hisi-iostat.c b/tools/perf/arch/arm64/util/hisi-iostat.c
>> new file mode 100644
>> index 000000000000..418eebece184
>> --- /dev/null
>> +++ b/tools/perf/arch/arm64/util/hisi-iostat.c
>> @@ -0,0 +1,433 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * perf iostat support for HiSilicon PCIe PMU.
>> + * Partly derived from tools/perf/arch/x86/util/iostat.c.
>> + *
>> + * Copyright (c) 2024 HiSilicon Technologies Co., Ltd.
>> + * Author: Yicong Yang <yangyicong at hisilicon.com>
>> + */
>> +
>> +#include <api/fs/fs.h>
>> +#include <linux/err.h>
>> +#include <linux/zalloc.h>
>> +#include <linux/limits.h>
>> +#include <dirent.h>
>> +#include <stdio.h>
>> +#include <errno.h>
>> +#include <stdlib.h>
>> +
>> +#include "util/counts.h"
>> +#include "util/cpumap.h"
>> +#include "util/debug.h"
>> +#include "util/iostat.h"
>> +#include "util/pmu.h"
>> +
>> +#define PCI_DEFAULT_DOMAIN		0
>> +#define PCI_DEVICE_NAME_PATTERN		"%04x:%02hhx:%02hhx.%hhu"
>> +#define PCI_ROOT_BUS_DEVICES_PATH	"bus/pci/devices"
>> +
>> +static const char * const hisi_iostat_metrics[] = {
>> +	"Inbound MWR(MB)",
>> +	"Inbound MRD(MB)",
>> +	"Inbound CPL(MB)",
>> +	"Outbound MWR(MB)",
>> +	"Outbound MRD(MB)",
>> +	"Outbound CPL(MB)",
>> +};
>> +
>> +static const char * const hisi_iostat_cmd_template[] = {
> 
> Given this array and the one above have to remain in same order, perhaps
> an enum?  Would also remove need to have the comments via
> [in_bound_wr] = "....
> etc
> 

Sure. Will use an enum to index both the metric name strings above and the
cmd templates here.
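
Roughly along these lines, as a sketch only. The enum and helper names below
are placeholders and not from the posted patch; the strings and event
encodings are copied from it unchanged:

```c
#include <assert.h>

/* Hypothetical enum names; a v2 may well choose different ones. */
enum hisi_iostat_metric {
	HISI_IOSTAT_INBOUND_MWR,
	HISI_IOSTAT_INBOUND_MRD,
	HISI_IOSTAT_INBOUND_CPL,
	HISI_IOSTAT_OUTBOUND_MWR,
	HISI_IOSTAT_OUTBOUND_MRD,
	HISI_IOSTAT_OUTBOUND_CPL,
	HISI_IOSTAT_METRIC_MAX,
};

/* Designated initializers keep each name tied to its enum slot. */
static const char * const hisi_iostat_metrics[HISI_IOSTAT_METRIC_MAX] = {
	[HISI_IOSTAT_INBOUND_MWR]  = "Inbound MWR(MB)",
	[HISI_IOSTAT_INBOUND_MRD]  = "Inbound MRD(MB)",
	[HISI_IOSTAT_INBOUND_CPL]  = "Inbound CPL(MB)",
	[HISI_IOSTAT_OUTBOUND_MWR] = "Outbound MWR(MB)",
	[HISI_IOSTAT_OUTBOUND_MRD] = "Outbound MRD(MB)",
	[HISI_IOSTAT_OUTBOUND_CPL] = "Outbound CPL(MB)",
};

/* Same indices, so the two tables cannot silently drift out of order. */
static const char * const hisi_iostat_cmd_template[HISI_IOSTAT_METRIC_MAX] = {
	[HISI_IOSTAT_INBOUND_MWR]  = "hisi_pcie%hu_core%hu/event=0x0104,port=0x%hx/",
	[HISI_IOSTAT_INBOUND_MRD]  = "hisi_pcie%hu_core%hu/event=0x0804,port=0x%hx/",
	[HISI_IOSTAT_INBOUND_CPL]  = "hisi_pcie%hu_core%hu/event=0x2004,port=0x%hx/",
	[HISI_IOSTAT_OUTBOUND_MWR] = "hisi_pcie%hu_core%hu/event=0x0105,port=0x%hx/",
	[HISI_IOSTAT_OUTBOUND_MRD] = "hisi_pcie%hu_core%hu/event=0x0405,port=0x%hx/",
	[HISI_IOSTAT_OUTBOUND_CPL] = "hisi_pcie%hu_core%hu/event=0x1005,port=0x%hx/",
};

/* Illustrative helper: map an evsel index onto its metric slot. */
static inline enum hisi_iostat_metric hisi_iostat_metric_of(int evsel_idx)
{
	assert(evsel_idx >= 0);
	return (enum hisi_iostat_metric)(evsel_idx % HISI_IOSTAT_METRIC_MAX);
}
```

With the array sizes tied to HISI_IOSTAT_METRIC_MAX, the per-entry comments
in the template array become redundant.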

> 
>> +	/* Inbound Memory Write */
>> +	"hisi_pcie%hu_core%hu/event=0x0104,port=0x%hx/",
>> +	/* Inbound Memory Read */
>> +	"hisi_pcie%hu_core%hu/event=0x0804,port=0x%hx/",
>> +	/* Inbound Memory Completion */
>> +	"hisi_pcie%hu_core%hu/event=0x2004,port=0x%hx/",
>> +	/* Outbound Memory Write */
>> +	"hisi_pcie%hu_core%hu/event=0x0105,port=0x%hx/",
>> +	/* Outbound Memory Read */
>> +	"hisi_pcie%hu_core%hu/event=0x0405,port=0x%hx/",
>> +	/* Outbound Memory Completion */
>> +	"hisi_pcie%hu_core%hu/event=0x1005,port=0x%hx/",
>> +};
>>
> 
> ...
> 
> 
> 
> 
>> +
>> +void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
>> +			 struct perf_stat_output_ctx *out)
>> +{
>> +	struct perf_counts_values *count;
>> +	const char *iostat_metric;
>> +	double iostat_value;
>> +
>> +	iostat_metric = hisi_iostat_metrics[evsel->core.idx % ARRAY_SIZE(hisi_iostat_metrics)];
>> +
>> +	/* We're using AGGR_GLOBAL so there's only one aggr counts aggr[0]. */
>> +	count = &evsel->stats->aggr[0].counts;
>> +
>> +	/* The counts has been scaled, we can use it directly. */
>> +	iostat_value = (double)count->val;
>> +
>> +	/*
>> +	 * Display two digits after decimal point for better accuracy if the
>> +	 * value is non-zero.
>> +	 */
>> +	out->print_metric(config, out->ctx, NULL,
>> +			  iostat_value > 0 ? "%8.2f" : "%8.0f",
>> +			  iostat_metric, iostat_value / (256 * 1024));
> 
> I assume that 256 * 1024 is MiB/sizeof(dword)?  Perhaps express it as
> that to make the reasoning clearer?
> 
> 

Yes, you're right. I mentioned it in the commit message. Will add a comment
here to make it clearer.
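
For illustration, the constant could be spelled out something like the
below. This is only a sketch: the macro and helper names are placeholders
that do not exist in the posted patch, and it assumes (as the commit message
states) that the PMU counts in 4-byte DWords:

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder names, not from the posted patch. */
#define HISI_PCIE_DWORD_BYTES		4ULL			/* one DWord is 4 bytes */
#define HISI_IOSTAT_MIB_BYTES		(1024ULL * 1024ULL)	/* bytes in one MiB */
#define HISI_IOSTAT_DWORDS_PER_MIB	(HISI_IOSTAT_MIB_BYTES / HISI_PCIE_DWORD_BYTES)

/* The named constant equals the open-coded 256 * 1024 divisor. */
static_assert(HISI_IOSTAT_DWORDS_PER_MIB == 256 * 1024,
	      "DWords per MiB should be 256 * 1024");

/* Convert a raw DWord count to MiB: count * 4B / 1024 / 1024. */
static inline double hisi_iostat_dwords_to_mib(uint64_t dwords)
{
	return (double)dwords / (double)HISI_IOSTAT_DWORDS_PER_MIB;
}
```

The print path would then divide by the named constant rather than the bare
256 * 1024.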

Thanks,
Yicong




