[PATCH 2/2] perf: arm_pmuv3: Don't use PMCCNTR_EL0 on SMT cores
Yicong Yang
yangyicong at huawei.com
Tue Aug 12 01:08:30 PDT 2025
From: Yicong Yang <yangyicong at hisilicon.com>
CPU_CYCLES is expected to count the logical CPU (PE) clock. Currently
PMCCNTR_EL0 is preferred for counting CPU_CYCLES, but on a multi-threaded
implementation it counts the processor clock rather than the PE clock
(ARM DDI0487 L.b D13.1.3), so it keeps incrementing on an idle PE as long
as one of its SMT siblings is not idle. So don't use it on SMT cores.
When counting cycles on SMT siblings CPU 2-3 with CPU 3 idle, without this
patch we get:
  [root@client1 tmp]# perf stat -e cycles -A -C 2-3 -- stress-ng -c 1 --taskset 2 --timeout 1
  [...]
   Performance counter stats for 'CPU(s) 2-3':
  CPU2           2880457316      cycles
  CPU3           2880459810      cycles
         1.254688470 seconds time elapsed
With this patch the idle state of CPU3 is observed as expected:
  [root@client1 ~]# perf stat -e cycles -A -C 2-3 -- stress-ng -c 1 --taskset 2 --timeout 1
  [...]
   Performance counter stats for 'CPU(s) 2-3':
  CPU2           2558580492      cycles
  CPU3               305749      cycles
         1.113626410 seconds time elapsed
Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
---
drivers/perf/arm_pmuv3.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 95c899d07df5..ed3149632b71 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -1002,6 +1002,15 @@ static bool armv8pmu_can_use_pmccntr(struct pmu_hw_events *cpuc,
 	if (has_branch_stack(event))
 		return false;
+	/*
+	 * PMCCNTR_EL0 increments from the processor clock rather than the
+	 * PE clock (ARM DDI0487 L.b D13.1.3), which means it'll continue
+	 * counting on a PE in WFI if one of its SMT siblings is not idle on
+	 * a multi-threaded implementation. So don't use it on SMT cores.
+	 */
+	if (cpumask_weight(topology_sibling_cpumask(smp_processor_id())) > 1)
+		return false;
+
 	return true;
 }
--
2.24.0