[PATCH v2 0/3] Enable the profiling of EL0&1 translation regime of ARM SPE

Yicong Yang yangyicong at huawei.com
Wed Nov 29 23:46:06 PST 2023


From: Yicong Yang <yangyicong at hisilicon.com>

When using SPE on a VHE-enabled host, the operation of a VM is not
profiled at all. This is because the driver uses
PMSCR_EL1.{E1SPE, E0SPE} to enable profiling. On a VHE-enabled host,
these accesses actually set PMSCR_EL2.{E2SPE, E0SPE}, which enables
profiling of EL2 and EL0 in the EL2&0 translation regime. However, the VM
uses the EL1&0 translation regime, so it is not profiled. We can enable
profiling of the EL1&0 translation regime by setting
PMSCR_EL12.{E1SPE, E0SPE} from EL2.
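
The register redirection described above can be illustrated with a small
userspace model. This is only a sketch, not the driver code: the struct,
enum and function names are hypothetical, the two bit positions follow the
PMSCR_EL1 layout (E0SPE is bit 0, E1SPE is bit 1), and everything else that
affects guest profiling (for example KVM's own save/restore) is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Enable bits; E1SPE's position is occupied by E2SPE in PMSCR_EL2. */
#define PMSCR_E0SPE (UINT64_C(1) << 0)
#define PMSCR_E1SPE (UINT64_C(1) << 1)

/* Register names the host kernel can use (hypothetical enum). */
enum spe_reg_name { SPE_NAME_PMSCR_EL1, SPE_NAME_PMSCR_EL12 };

/* Registers that actually back those names (hypothetical enum). */
enum spe_phys_reg { SPE_PHYS_PMSCR_EL1, SPE_PHYS_PMSCR_EL2, SPE_PHYS_COUNT };

struct spe_model {
	bool vhe;                      /* host kernel runs at EL2 with VHE */
	uint64_t phys[SPE_PHYS_COUNT]; /* modelled register contents */
};

/* Map the name used by the host onto the register that is really written. */
static enum spe_phys_reg spe_resolve(const struct spe_model *m,
				     enum spe_reg_name name)
{
	if (name == SPE_NAME_PMSCR_EL12) {
		/* The *_EL12 alias is only meaningful on a VHE host. */
		assert(m->vhe);
		return SPE_PHYS_PMSCR_EL1;
	}
	/* On a VHE host the PMSCR_EL1 name is redirected to PMSCR_EL2. */
	return m->vhe ? SPE_PHYS_PMSCR_EL2 : SPE_PHYS_PMSCR_EL1;
}

static void spe_set_bits(struct spe_model *m, enum spe_reg_name name,
			 uint64_t bits)
{
	m->phys[spe_resolve(m, name)] |= bits;
}

/* True if sampling is enabled for the EL1&0 regime (the one a VM uses). */
static bool spe_el10_enabled(const struct spe_model *m)
{
	return (m->phys[SPE_PHYS_PMSCR_EL1] &
		(PMSCR_E0SPE | PMSCR_E1SPE)) != 0;
}
```

In this model, calling spe_set_bits() with SPE_NAME_PMSCR_EL1 on a VHE host
leaves spe_el10_enabled() false, while the same call with
SPE_NAME_PMSCR_EL12 makes it true.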

This series adds support for this by:
- Adding the sysreg definition of PMSCR_EL12
- Refactoring the code to allow extension
- Enabling the profiling of EL1&0 and completing perf's exclude_* options

Tests have been done on VHE and non-VHE hosts with
`perf record -e {arm_spe_0//}:u|k|h|G|H`
and the results are as expected.

A sample from EL0 in the VM looks like this (generated by `perf report -D`):
.  000003b8:  b0 48 eb 12 ab ff ff 00 80                      PC 0xffffab12eb48 el0 ns=1
.  000003c1:  99 07 00                                        LAT 7 ISSUE
.  000003c4:  98 08 00                                        LAT 8 TOT
.  000003c7:  62 42 00 00 00                                  EV RETIRED NOT-TAKEN
.  000003cc:  4a 01                                           B COND
.  000003ce:  00 00                                           PAD
.  000003d0:  b1 4c eb 12 ab ff ff 00 80                      TGT 0xffffab12eb4c el0 ns=1
.  000003d9:  00 00 00                                        PAD
.  000003dc:  64 50 01 00 00                                  CONTEXT 0x150 el1
.  000003e1:  65 3c 11 00 00                                  CONTEXT 0x113c el2
.  000003e6:  00 00 00 00 00                                  PAD
.  000003eb:  71 aa aa 00 fd 06 00 00 00                      TS 30014483114
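
To make the dump above easier to read, here is a minimal sketch of how the
payload bytes map to the decoded fields. It assumes the address packet
layout that matches this dump (little-endian payload, address in bits
[55:0], exception level in bits [62:61], NS in bit [63]); the struct and
function names are hypothetical and are not taken from perf.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of one decoded SPE address packet (PC or TGT). */
struct spe_addr {
	uint64_t addr; /* virtual address, payload bits [55:0] */
	unsigned el;   /* exception level, payload bits [62:61] */
	unsigned ns;   /* non-secure flag, payload bit [63] */
};

/* Assemble a little-endian payload of n bytes (n <= 8) into an integer. */
static uint64_t spe_payload_le(const uint8_t *p, unsigned n)
{
	uint64_t v = 0;

	assert(n <= 8);
	for (unsigned i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Split the 8 payload bytes that follow a 0xb0 (PC) or 0xb1 (TGT) header. */
static struct spe_addr spe_decode_addr(const uint8_t *payload)
{
	uint64_t v = spe_payload_le(payload, 8);
	struct spe_addr a = {
		.addr = v & ((UINT64_C(1) << 56) - 1),
		.el   = (unsigned)((v >> 61) & 0x3),
		.ns   = (unsigned)((v >> 63) & 0x1),
	};

	return a;
}
```

Feeding the PC payload `48 eb 12 ab ff ff 00 80` to spe_decode_addr() gives
address 0xffffab12eb48 with el 0 and ns 1, and spe_payload_le() on the TS
payload `aa aa 00 fd 06 00 00 00` gives 30014483114, both matching the dump.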

Changes since v1:
- Add tag from Mark and order PMSCR_EL12 by address as suggested. For now,
  PMSCR_EL12 is at least kept in order among its *_EL12 siblings in sysreg.
Link: https://lore.kernel.org/linux-arm-kernel/20231122084602.53914-1-yangyicong@huawei.com/

Yicong Yang (3):
  arm64/sysreg: Add PMSCR_EL12 and factor out the common fields
  perf: arm_spe: Factor out PMSCR set/clear operations
  perf: arm_spe: Enable the profiling of EL0&1 translation regime

 arch/arm64/tools/sysreg    | 10 ++++-
 drivers/perf/arm_spe_pmu.c | 76 ++++++++++++++++++++++++++++----------
 2 files changed, 65 insertions(+), 21 deletions(-)

-- 
2.24.0



