[PATCH V13 - RESEND 06/10] arm64/perf: Enable branch stack events via FEAT_BRBE

Anshuman Khandual anshuman.khandual at arm.com
Tue Jul 25 23:26:14 PDT 2023


On 7/11/23 13:54, Anshuman Khandual wrote:
> +static void capture_brbe_flags(struct perf_branch_entry *entry, struct perf_event *event,
> +			       u64 brbinf)
> +{
> +	if (branch_sample_type(event))
> +		brbe_set_perf_entry_type(entry, brbinf);
> +
> +	if (!branch_sample_no_cycles(event))
> +		entry->cycles = brbe_get_cycles(brbinf);

Revisiting how the cycle count of a branch record is captured in BRBE.

The 'cycles' field in struct perf_branch_entry is 16 bits wide.

struct perf_branch_entry {
        __u64   from;
        __u64   to;
        __u64   mispred:1,  /* target mispredicted */
                predicted:1,/* target predicted */
                in_tx:1,    /* in transaction */
                abort:1,    /* transaction abort */
                cycles:16,  /* cycle count to last branch */
                type:4,     /* branch type */
                spec:2,     /* branch speculation info */
                new_type:4, /* additional branch type */
                priv:3,     /* privilege level */
                reserved:31;
};

static inline int brbe_get_cycles(u64 brbinf)
{
	/*
	 * Captured cycle count is unknown and hence
	 * should not be passed on to the user space.
	 */
	if (brbinf & BRBINFx_EL1_CCU)
		return 0;

	return FIELD_GET(BRBINFx_EL1_CC_MASK, brbinf);
}

capture_brbe_flags()
{

	.....
	if (!branch_sample_no_cycles(event))
		entry->cycles = brbe_get_cycles(brbinf);
	.....

}

BRBINF<N>_EL1.CC[45:32] is a 14-bit field that gets assigned to the
perf_branch_entry->cycles field, which is 16 bits wide. However, the
cycle count is encoded there as a mantissa and an exponent, not as a
plain integer.

CC bits[7:0] indicate the mantissa M
CC bits[13:8] indicate the exponent E
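
To make the layout concrete, here is a rough userspace sketch of pulling
the raw CC value out of a BRBINF<N>_EL1 value and splitting it into M and
E. The bit positions are the ones quoted above; every ex_* name is
hypothetical and is not part of the patch or of perf.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: CC occupies BRBINF<N>_EL1 bits [45:32]. */
#define EX_CC_SHIFT	32
#define EX_CC_WIDTH	14

/* The raw 14-bit encoding fits into the 16-bit 'cycles' bitfield. */
static_assert(EX_CC_WIDTH <= 16, "raw CC must fit in 16 bits");

struct ex_cc_fields {
	uint8_t mantissa;	/* CC bits [7:0]  */
	uint8_t exponent;	/* CC bits [13:8] */
};

/* Extract the raw (still encoded) CC value from a BRBINF value. */
static inline uint16_t ex_brbinf_raw_cc(uint64_t brbinf)
{
	return (uint16_t)((brbinf >> EX_CC_SHIFT) & ((1u << EX_CC_WIDTH) - 1));
}

/* Split the raw CC value into mantissa M and exponent E. */
static inline struct ex_cc_fields ex_split_cc(uint16_t cc)
{
	struct ex_cc_fields f = {
		.mantissa = (uint8_t)(cc & 0xff),
		.exponent = (uint8_t)((cc >> 8) & 0x3f),
	};
	return f;
}
```

Nothing is lost by storing the raw value in 'cycles', but what is stored
is the encoding, not a cycle count.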

The actual cycle count has to be derived from the BRBINF<N>_EL1.CC field
using the following formula, as per the spec [1].

-------------------------------------------------------------------
The cycle count is expressed using the following function:

if IsZero(E) then UInt(M) else UInt('1':M:Zeros(UInt(E)-1))

If required, the cycle count is rounded to a multiple of 2^(E-1) towards zero before being encoded.

A value of all ones in both the mantissa and exponent indicates the cycle count value exceeded the size of the cycle counter.
-------------------------------------------------------------------
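
A minimal sketch of how the quoted function could be evaluated in user
space, assuming the raw 14-bit CC value is what arrives in 'cycles'.
ex_brbe_cc_to_cycles() is a hypothetical name. Returning UINT64_MAX for
the all-ones encoding, and for exponents whose result would not fit in
64 bits, is my own assumption and is not mandated by the spec text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch or of the perf tool. */
static uint64_t ex_brbe_cc_to_cycles(uint16_t cc)
{
	uint64_t m = cc & 0xff;			/* mantissa M, CC bits [7:0]  */
	unsigned int e = (cc >> 8) & 0x3f;	/* exponent E, CC bits [13:8] */

	assert(cc <= 0x3fff);			/* CC is a 14-bit field */

	/* IsZero(E): the count is just UInt(M). */
	if (e == 0)
		return m;

	/* All ones in M and E: the count exceeded the cycle counter. */
	if (e == 0x3f && m == 0xff)
		return UINT64_MAX;

	/*
	 * Assumption: saturate when '1':M:Zeros(E-1) would need more
	 * than 64 bits (9 significant bits shifted left by E-1).
	 */
	if (e > 56)
		return UINT64_MAX;

	/* UInt('1':M:Zeros(UInt(E)-1)) */
	return (UINT64_C(0x100) | m) << (e - 1);
}
```

For example, CC = 0x00ff decodes to 255, CC = 0x0100 to 256 and
CC = 0x0201 to 514, so the encoding is continuous across E = 0 and E = 1
and only starts dropping low-order bits from E = 2 onwards.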

Given that there are only 16 bits in the perf_branch_entry structure for
the cycle count, I assume that it needs to be derived in the user perf
tool (per the above calculation) before being interpreted?

[1] https://developer.arm.com/documentation/ddi0601/2023-06/AArch64-Registers/BRBINF-n--EL1--Branch-Record-Buffer-Information-Register--n-?lang=en

- Anshuman




