[PATCH 2/2] perf arm-spe: Parse more SPE fields and store source
Leo Yan
leo.yan at linaro.org
Fri Feb 11 20:19:27 PST 2022
Hi German, Ali,
On Fri, Feb 11, 2022 at 04:31:40PM +0000, German Gomez wrote:
> Hi Ali,
>
> On 28/01/2022 21:02, Ali Saidi wrote:
> > Hi German,
> >
> > On 28/01/2022 19:20, German Gomez wrote:
> >> Hi Ali,
> >>
> >> [...]
> >>> };
> >>>
> >>> enum arm_spe_op_type {
> >>> ARM_SPE_LD = 1 << 0,
> >>> ARM_SPE_ST = 1 << 1,
> >>> + ARM_SPE_LDST_EXCL = 1 << 2,
> >>> + ARM_SPE_LDST_ATOMIC = 1 << 3,
> >>> + ARM_SPE_LDST_ACQREL = 1 << 4,
>
> Wondering if we can store this in perf_sample->flags. The values are
> defined in "util/event.h" (PERF_IP_*). Maybe we can extend it to allow
> doing "sample->flags = PERF_LDST_FLAG_LD | PERF_LDST_FLAG_ATOMIC" and
> such.
>
> @Leo do you think that could work?
Let's step back a bit and divide the decoding flow into two parts:
backend and frontend.
For the backend part, we decode the SPE hardware trace data and
generate the SPE record in the file
util/arm-spe-decoder/arm-spe-decoder.c. As we want to support
complete operation types, we can extend arm_spe_op_type as below:
enum arm_spe_op_type {
/* First level operation type */
ARM_SPE_OP_OTHER = 1 << 0,
ARM_SPE_OP_LDST = 1 << 1,
ARM_SPE_OP_BRANCH_ERET = 1 << 2,
/* Second level operation type for OTHER */
ARM_SPE_OP_SVE_OTHER = 1 << 16,
ARM_SPE_OP_SVE_FP = 1 << 17,
ARM_SPE_OP_SVE_PRED = 1 << 18,
/* Second level operation type for LDST */
ARM_SPE_OP_LD = 1 << 16,
ARM_SPE_OP_ST = 1 << 17,
ARM_SPE_OP_ATOMIC = 1 << 18,
ARM_SPE_OP_EXCL = 1 << 19,
ARM_SPE_OP_AR = 1 << 20,
ARM_SPE_OP_SIMD_FP = 1 << 21,
ARM_SPE_OP_GP_REG = 1 << 22,
ARM_SPE_OP_UNSPEC_REG = 1 << 23,
ARM_SPE_OP_NV_SYSREG = 1 << 24,
ARM_SPE_OP_SVE_PRED = 1 << 25,
ARM_SPE_OP_SVE_SG = 1 << 26,
/* Second level operation type for BRANCH_ERET */
ARM_SPE_OP_BR_COND = 1 << 16,
ARM_SPE_OP_BR_INDIRECT = 1 << 17,
};
IIUC, Ali suggested to directly reuse packet format to express
operation type and don't need to redefine the operation types like
above structure arm_spe_op_type. I personally bias to convert the raw
packet format to more readable format, a benefit is this would give
us more readable code.
For the frontend, we need to convert Arm SPE record to samples.
We can explore two fields: sample::flags and sample::data_src, for
load/store related operations, I perfer we can fill more complete
info in field sample::data_src and extend the definitions for
perf_mem_data_src; for branch operations, we can fill sample::flags.
So I am just wandering if we can set the field
sample::data_src::mem_lock for atomic operations, like:
data_src.mem_op = PERF_MEM_OP_LOAD;
data_src.mem_lock = PERF_MEM_LOCK_ATOMIC;
The field "mem_lock" is only two bits, we can consider to extend the
structure with an extra filed "mem_lock_ext" if it cannot meet our
requirement.
> >>> + ARM_SPE_BR = 1 << 5,
> >>> + ARM_SPE_BR_COND = 1 << 6,
> >>> + ARM_SPE_BR_IND = 1 << 7,
>
> Seems like we can store BR_COND in the existing "branch-miss" event
> (--itrace=b) with:
>
> sample->flags = PERF_IP_FLAG_BRANCH;
> sample->flags |= PERF_IP_FLAG_CONDITIONAL;
> and/or
> sample->flags |= PERF_IP_FLAG_INDIRECT;
>
> PERF_IP_FLAG_INDIRECT doesn't exist yet but we can probably add it.
Yes, for branch samples, this makes sense for me.
Thanks,
Leo
More information about the linux-arm-kernel
mailing list