[PATCH v1 3/4] perf record: Skip don't fail for events that don't open
Leo Yan
leo.yan at arm.com
Tue Nov 12 11:53:43 PST 2024
On Sat, Oct 26, 2024 at 05:17:57AM -0700, Ian Rogers wrote:
>
> Whilst for many tools it is an expected behavior that failure to open
> a perf event is a failure, ARM decided to name PMU events the same as
> legacy events and then failed to rename such events on a server uncore
> SLC PMU. As perf's default behavior when no PMU is specified is to
> open the event on all PMUs that advertise/"have" the event, this
> yielded failures when trying to make the priority of legacy and
> sysfs/json events uniform - something requested by RISC-V and ARM. A
> legacy event user on ARM hardware may find their event opened on an
> uncore PMU which for perf record will fail. Arnaldo suggested skipping
> such events which this patch implements. Rather than have the skipping
> conditional on running on ARM, the skipping is done on all
> architectures as such a fundamental behavioral difference could lead
> to problems with tools built/depending on perf.
>
> An example of perf record failing to open events on x86 is:
> ```
> $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> Error:
> Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
>
> Error:
> Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
>
> Error:
> Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> The LLC-prefetch-read event is not supported.
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
>
> $ perf report --stats
> Aggregated stats:
> TOTAL events: 17255
> MMAP events: 284 ( 1.6%)
> COMM events: 1961 (11.4%)
> EXIT events: 1 ( 0.0%)
> FORK events: 1960 (11.4%)
> SAMPLE events: 87 ( 0.5%)
> MMAP2 events: 12836 (74.4%)
> KSYMBOL events: 83 ( 0.5%)
> BPF_EVENT events: 36 ( 0.2%)
> FINISHED_ROUND events: 2 ( 0.0%)
> ID_INDEX events: 1 ( 0.0%)
> THREAD_MAP events: 1 ( 0.0%)
> CPU_MAP events: 1 ( 0.0%)
> TIME_CONV events: 1 ( 0.0%)
> FINISHED_INIT events: 1 ( 0.0%)
> cycles stats:
> SAMPLE events: 87
> ```
Thanks to James for reminding me. I tested this on an AVA platform:
# tree /sys/bus/event_source/devices/arm_dsu_*/events
...
/sys/bus/event_source/devices/arm_dsu_9/events
├── bus_access
├── bus_cycles
├── cycles
├── l3d_cache
├── l3d_cache_allocate
├── l3d_cache_refill
├── l3d_cache_wb
└── memory_error
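For anyone who wants to check which PMUs advertise a given event without `tree`, a rough sketch follows. It assumes the usual sysfs layout shown above; `pmus_advertising` is a hypothetical helper name for illustration and is not part of perf.

```python
import os

# Standard sysfs location where the kernel exposes PMU devices.
SYSFS_PMU_ROOT = "/sys/bus/event_source/devices"


def pmus_advertising(event, root=SYSFS_PMU_ROOT):
    """Return the sorted names of PMUs whose events/ directory lists `event`."""
    found = []
    if not os.path.isdir(root):
        return found
    for pmu in os.listdir(root):
        # A PMU "has" the event if <root>/<pmu>/events/<event> exists.
        if os.path.exists(os.path.join(root, pmu, "events", event)):
            found.append(pmu)
    return sorted(found)


if __name__ == "__main__":
    for name in pmus_advertising("cycles"):
        print(name)
```

On a machine like the one above this would be expected to list the arm_dsu_* PMUs that carry a `cycles` entry; I have not run this script on the AVA platform.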
# ./perf record -- sleep 0.1
Error:
Failure to open event 'cycles:PH' on PMU 'arm_dsu_0' which will be removed.
cycles:PH: PMU Hardware doesn't support sampling/overflow-interrupts.
Try 'perf stat'
Error:
Failure to open event 'cycles:PH' on PMU 'arm_dsu_1' which will be removed.
cycles:PH: PMU Hardware doesn't support sampling/overflow-interrupts.
Try 'perf stat'
...
Error:
Failure to open event 'cycles:PH' on PMU 'arm_dsu_15' which will be removed.
cycles:PH: PMU Hardware doesn't support sampling/overflow-interrupts.
Try 'perf stat'
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.008 MB perf.data (8 samples) ]
# ./perf report --stats
Aggregated stats:
TOTAL events: 67
MMAP events: 40 (59.7%)
COMM events: 1 ( 1.5%)
SAMPLE events: 8 (11.9%)
KSYMBOL events: 6 ( 9.0%)
BPF_EVENT events: 6 ( 9.0%)
FINISHED_ROUND events: 1 ( 1.5%)
ID_INDEX events: 1 ( 1.5%)
THREAD_MAP events: 1 ( 1.5%)
CPU_MAP events: 1 ( 1.5%)
TIME_CONV events: 1 ( 1.5%)
FINISHED_INIT events: 1 ( 1.5%)
cycles:P stats:
SAMPLE events: 8
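The behaviour seen above (warn about each event that cannot be opened, drop it, and carry on recording with the rest) can be modelled roughly as below. This is a simplified Python sketch and not the perf implementation: `open_events`, `try_open` and `fake_open` are hypothetical names, `core_pmu` is a placeholder for the CPU PMU, and raising an error when nothing at all can be opened is my assumption.

```python
def open_events(events, try_open):
    """Open each (event, pmu) pair, skipping failures instead of aborting.

    `try_open` is a hypothetical callback returning None on success or an
    error string on failure. Returns (kept, skipped).
    """
    kept, skipped = [], []
    for event, pmu in events:
        err = try_open(event, pmu)
        if err is None:
            kept.append((event, pmu))
            continue
        # Report the failure, then drop the event rather than exiting.
        print(f"Failure to open event '{event}' on PMU '{pmu}' which will be removed.")
        print(err)
        skipped.append((event, pmu))
    if not kept:
        # Assumption: with no events left there is nothing to record.
        raise RuntimeError("no events could be opened")
    return kept, skipped


def fake_open(event, pmu):
    """Stand-in for the syscall: DSU PMUs cannot sample, the core PMU can."""
    if pmu.startswith("arm_dsu_"):
        return f"{event}: PMU Hardware doesn't support sampling/overflow-interrupts."
    return None


if __name__ == "__main__":
    wanted = [("cycles:PH", "arm_dsu_0"), ("cycles:PH", "arm_dsu_1"),
              ("cycles:PH", "core_pmu")]  # "core_pmu" is a placeholder name
    kept, skipped = open_events(wanted, fake_open)
    print("recording with:", kept)
```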
# ./perf stat -- sleep 0.1
 Performance counter stats for 'sleep 0.1':

              0.87 msec task-clock                #    0.009 CPUs utilized
                 1      context-switches          #    1.148 K/sec
                 0      cpu-migrations            #    0.000 /sec
                52      page-faults               #   59.685 K/sec
           877,835      instructions              #    1.14  insn per cycle
                                                  #    0.25  stalled cycles per insn
           772,102      cycles                    #  886.210 M/sec
           191,914      stalled-cycles-frontend   #   24.86% frontend cycles idle
           219,183      stalled-cycles-backend    #   28.39% backend cycles idle
           184,099      branches                  #  211.307 M/sec
             8,548      branch-misses             #    4.64% of all branches

       0.101623529 seconds time elapsed

       0.001645000 seconds user
       0.000000000 seconds sys
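As a sanity check that the counters above come from the core PMU and are self-consistent, the derived ratios can be recomputed from the raw counts. This is only a sketch: it assumes "stalled cycles per insn" uses the larger of the two stall counts, which is consistent with the 0.25 shown.

```python
# Raw counts copied from the perf stat output above.
instructions = 877_835
cycles = 772_102
stalled_frontend = 191_914
stalled_backend = 219_183
branches = 184_099
branch_misses = 8_548

ipc = instructions / cycles
# Assumption: the larger of the two stall counts is used for this ratio.
stalled_per_insn = max(stalled_frontend, stalled_backend) / instructions
frontend_idle = 100.0 * stalled_frontend / cycles
backend_idle = 100.0 * stalled_backend / cycles
miss_rate = 100.0 * branch_misses / branches

print(f"{ipc:.2f} insn per cycle")                  # 1.14
print(f"{stalled_per_insn:.2f} stalled cycles per insn")  # 0.25
print(f"{frontend_idle:.2f}% frontend cycles idle")  # 24.86%
print(f"{backend_idle:.2f}% backend cycles idle")    # 28.39%
print(f"{miss_rate:.2f}% of all branches")           # 4.64%
```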
Tested-by: Leo Yan <leo.yan at arm.com>