[PATCH 19/25] perf/uapi: Extend data source fields
James Clark
james.clark at linaro.org
Thu Oct 9 03:00:23 PDT 2025
On 29/09/2025 5:37 pm, Leo Yan wrote:
> Arm CPUs introduce several new types of memory operations, such as MTE
> tag access, system register access for nested virtualization, memcpy &
> memset, and Guarded Control Stack (GCS).
>
> For memory operation details, Arm SPE provides information like data
> (parallel) processing, floating-point, predicated, atomic, exclusive,
> acquire/release, gather/scatter, and conditional.
>
> This commit introduces a field 'mem_op_ext' for extended operation type.
> The extended operation type can be combined with the existing operation
> type to express a memory type. For example, PERF_MEM_EXT_OP_GCS can be
> set along with PERF_MEM_OP_LOAD to represent a load operation for
> GCS register access.
>
> Also use a field 'mem_aff' to store affiliate information.
>
> Signed-off-by: Leo Yan <leo.yan at arm.com>
> ---
> include/uapi/linux/perf_event.h | 28 ++++++++++++++++++++++++++--
> 1 file changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 78a362b8002776e5ce83a0d7816601638c61ecc6..51ab37d44ac31fcdc4bc919c14d5f97e560d9339 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -1309,14 +1309,18 @@ union perf_mem_data_src {
> mem_snoopx : 2, /* Snoop mode, ext */
> mem_blk : 3, /* Access blocked */
> mem_hops : 3, /* Hop level */
> - mem_rsvd : 18;
> + mem_op_ext : 6, /* Extended type of opcode */
Why have a 6-bit field containing 1 bit per thing when you can have 6
named 1-bit fields? That way you don't have to do a load of bitwise
magic to access it and you don't need separate #defines.
Also, when you set these, you never set more than one bit at a time, so
they're mutually exclusive. Would a 3-bit enum be better than a 6-bit
bitfield in this case?
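
To make the comparison concrete, here is a rough sketch of the three layouts side by side. It is only an illustration: the existing fields are collapsed into a single 46-bit 'low' member, the 1-bit field names and the enum are hypothetical, uint64_t stands in for __u64, and it assumes the little-endian bit-field layout that GCC and Clang produce.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch. */
#define PERF_MEM_EXT_OP_GCS	0x0020
#define PERF_MEM_EXT_OP_SHIFT	46

/* Option A (as posted): one 6-bit field holding a mask. */
union src_mask {
	uint64_t val;
	struct {
		uint64_t low        : 46,	/* existing fields, collapsed */
			 mem_op_ext : 6,
			 mem_aff    : 8,
			 mem_rsvd   : 4;
	};
};

/* Option B: six named 1-bit fields (names are hypothetical). */
union src_bits {
	uint64_t val;
	struct {
		uint64_t low                : 46,
			 mem_op_mte_tag     : 1,
			 mem_op_nested_virt : 1,
			 mem_op_memcpy      : 1,
			 mem_op_memset      : 1,
			 mem_op_simd        : 1,
			 mem_op_gcs         : 1,
			 mem_aff            : 8,
			 mem_rsvd           : 4;
	};
};

/* Option C: a 3-bit enum, if the types really are exclusive (hypothetical). */
enum ext_op {
	EXT_OP_NONE, EXT_OP_MTE_TAG, EXT_OP_NESTED_VIRT, EXT_OP_MEMCPY,
	EXT_OP_MEMSET, EXT_OP_SIMD, EXT_OP_GCS,
};

union src_enum {
	uint64_t val;
	struct {
		uint64_t low        : 46,
			 mem_op_ext : 3,
			 mem_rsvd   : 15;	/* three bits saved vs. option A */
	};
};

/* All three layouts must still fit the 64-bit data source word. */
static_assert(sizeof(union src_mask) == 8, "option A is 64 bits");
static_assert(sizeof(union src_bits) == 8, "option B is 64 bits");
static_assert(sizeof(union src_enum) == 8, "option C is 64 bits");

/* The same question asked of each layout. */
static inline int is_gcs_mask(union src_mask s)
{
	return !!(s.mem_op_ext & PERF_MEM_EXT_OP_GCS);
}

static inline int is_gcs_bits(union src_bits s)
{
	return s.mem_op_gcs;
}

static inline int is_gcs_enum(union src_enum s)
{
	return s.mem_op_ext == EXT_OP_GCS;
}
```

With option B the accessor is just a member read and the #defines go away. Option C only works if two extended types can never apply to the same access.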
> + mem_aff : 8, /* Affiliate info */
> + mem_rsvd : 4;
> };
> };
> #elif defined(__BIG_ENDIAN_BITFIELD)
> union perf_mem_data_src {
> __u64 val;
> struct {
> - __u64 mem_rsvd : 18,
> + __u64 mem_rsvd : 4,
> + mem_aff : 8, /* Affiliate info */
> + mem_op_ext : 6, /* Extended type of opcode */
> mem_hops : 3, /* Hop level */
> mem_blk : 3, /* Access blocked */
> mem_snoopx : 2, /* Snoop mode, ext */
> @@ -1426,6 +1430,26 @@ union perf_mem_data_src {
> /* 5-7 available */
> #define PERF_MEM_HOPS_SHIFT 43
>
> +/* Extended type of memory opcode: */
> +#define PERF_MEM_EXT_OP_MTE_TAG 0x0001 /* MTE tag */
> +#define PERF_MEM_EXT_OP_NESTED_VIRT 0x0002 /* Nested virtualization */
> +#define PERF_MEM_EXT_OP_MEMCPY 0x0004 /* Memory copy */
> +#define PERF_MEM_EXT_OP_MEMSET 0x0008 /* Memory set */
> +#define PERF_MEM_EXT_OP_SIMD 0x0010 /* SIMD */
> +#define PERF_MEM_EXT_OP_GCS 0x0020 /* Guarded Control Stack */
> +#define PERF_MEM_EXT_OP_SHIFT 46
> +
> +/* Affiliate info */
"Affiliate info" doesn't really describe what these are supposed to be,
or why they are separate. Is it implying that they're always set in
addition to another flag? Like "details" or "category"?
Either way, I feel that limitation might be a bit strict for the generic
uapi, or it needs to be described in more detail. If we change this
field to be individual bits like the other one, then maybe we can drop
the idea that it's a separate group and just treat it as a bunch of bits
you can set however you like.
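
As a sketch of what "a bunch of bits you can set however you like" could look like from a consumer's side, something like the following might do. The flag values and shift are copied from the patch. The mask and both helpers are hypothetical, and the atomic plus acquire/release pairing is only an illustrative combination, not something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch. */
#define PERF_MEM_AFF_ATOMIC	0x0008
#define PERF_MEM_AFF_EXCLUSIVE	0x0010
#define PERF_MEM_AFF_AR		0x0020
#define PERF_MEM_AFF_SHIFT	52

/* Hypothetical: mem_aff is 8 bits wide in the patch. */
#define PERF_MEM_AFF_MASK	0xffULL

/* The 8-bit field must not run past the end of the 64-bit word. */
static_assert(PERF_MEM_AFF_SHIFT + 8 <= 64, "mem_aff fits in data_src");

/* Extract the mem_aff bits from a raw data_src value. */
static inline unsigned int mem_aff_get(uint64_t data_src)
{
	return (unsigned int)((data_src >> PERF_MEM_AFF_SHIFT) &
			      PERF_MEM_AFF_MASK);
}

/* True if every attribute in 'flags' is set; attributes may be combined. */
static inline int mem_aff_has_all(uint64_t data_src, unsigned int flags)
{
	return (mem_aff_get(data_src) & flags) == flags;
}
```

For example, a sample carrying PERF_MEM_AFF_ATOMIC | PERF_MEM_AFF_AR shifted by PERF_MEM_AFF_SHIFT would satisfy mem_aff_has_all() for both flags at once, with no requirement that some other flag is set alongside them.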
> +#define PERF_MEM_AFF_DP 0x0001 /* Data processing */
> +#define PERF_MEM_AFF_FP 0x0002 /* Floating-point */
> +#define PERF_MEM_AFF_PRED 0x0004 /* Predicated */
> +#define PERF_MEM_AFF_ATOMIC 0x0008 /* Atomic */
> +#define PERF_MEM_AFF_EXCLUSIVE 0x0010 /* Exclusive */
> +#define PERF_MEM_AFF_AR 0x0020 /* Acquire/release */
> +#define PERF_MEM_AFF_SG 0x0040 /* Gather/Scatter */
> +#define PERF_MEM_AFF_CONDITIONAL 0x0080 /* Conditional */
> +#define PERF_MEM_AFF_SHIFT 52
> +
> #define PERF_MEM_S(a, s) \
> (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
>
>
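
For reference, the combination described in the commit message (a load that is also a GCS access) could be encoded with the existing PERF_MEM_S() macro roughly as below. This is a minimal userspace sketch: uint64_t stands in for __u64, the new values are copied from the patch, and gcs_load_data_src() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Existing uapi values. */
#define PERF_MEM_OP_LOAD	0x0002
#define PERF_MEM_OP_SHIFT	0

/* Values copied from the patch. */
#define PERF_MEM_EXT_OP_GCS	0x0020
#define PERF_MEM_EXT_OP_SHIFT	46

/* Same shape as the uapi macro, with uint64_t standing in for __u64. */
#define PERF_MEM_S(a, s) \
	(((uint64_t)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)

/* The flag has to fit in the 6-bit mem_op_ext field. */
static_assert((PERF_MEM_EXT_OP_GCS >> 6) == 0, "flag fits in 6 bits");

/* Hypothetical helper: data_src value for a load that accesses the GCS. */
static inline uint64_t gcs_load_data_src(void)
{
	return PERF_MEM_S(OP, LOAD) | PERF_MEM_S(EXT_OP, GCS);
}
```

Note that the macro prefix here is EXT_OP while the struct field is mem_op_ext, so the token pasting only works with the names as spelled in the #defines.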