[PATCH V4 03/10] perf: Extend branch type classification

Peter Zijlstra peterz at infradead.org
Tue Mar 15 04:22:32 PDT 2022


On Tue, Mar 15, 2022 at 11:05:09AM +0530, Anshuman Khandual wrote:
> branch_entry.type has now run out of space to accommodate more branch type
> classifications. This will prevent the perf branch stack implementation on
> arm64 (via BRBE) from capturing all available branch types. Extending this
> bit field, i.e. branch_entry.type [4 bits], is not an option as it will
> break the user space ABI for both little and big endian perf tools.
> 
> Extend branch classification with a new field branch_entry.new_type via a
> new branch type PERF_BR_EXTEND_ABI in branch_entry.type. Perf tools that
> can decode PERF_BR_EXTEND_ABI will then parse branch_entry.new_type as
> well.
> 
> branch_entry.new_type is a 4 bit field which can hold up to 16 branch types.
> The first three branch types will hold various generic page faults, followed
> by five architecture specific branch types, which can be overridden by the
> platform for specific use cases. These architecture specific branch types
> get overridden on the arm64 platform for the BRBE implementation.

> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 26d8f0b5ac0d..d29280adc3c4 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -255,9 +255,22 @@ enum {
>  	PERF_BR_IRQ		= 12,	/* irq */
>  	PERF_BR_SERROR		= 13,	/* system error */
>  	PERF_BR_NO_TX		= 14,	/* not in transaction */
> +	PERF_BR_EXTEND_ABI	= 15,	/* extend ABI */
>  	PERF_BR_MAX,
>  };


>  #define PERF_SAMPLE_BRANCH_PLM_ALL \
>  	(PERF_SAMPLE_BRANCH_USER|\
>  	 PERF_SAMPLE_BRANCH_KERNEL|\
> @@ -1372,7 +1385,8 @@ struct perf_branch_entry {
>  		abort:1,    /* transaction abort */
>  		cycles:16,  /* cycle count to last branch */
>  		type:4,     /* branch type */
> -		reserved:40;
> +		new_type:4, /* additional branch type */
> +		reserved:36;
>  };

Hurmpf... this will effectively give us 5 bits of space for the cost of
8, that seems... unfortunate.
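
To spell out the arithmetic, here is a rough sketch; the SKETCH_* names are made up for illustration and do not exist in perf_event.h. It assumes the patch is read as: type values 0..14 keep their direct meaning, and 15 (PERF_BR_EXTEND_ABI) redirects to new_type.

```c
/* Sketch only: these names are illustrative, not from perf_event.h. */
enum {
	SKETCH_TYPE_BITS     = 4,	/* width of branch_entry.type */
	SKETCH_NEW_TYPE_BITS = 4,	/* width of branch_entry.new_type */

	/*
	 * Escape scheme: 15 direct type values (0..14), plus the 16
	 * new_type values reachable through PERF_BR_EXTEND_ABI.
	 */
	SKETCH_ESCAPE_TYPES = ((1 << SKETCH_TYPE_BITS) - 1) +
			      (1 << SKETCH_NEW_TYPE_BITS),

	/* Flat scheme: both fields read together as one 8 bit number. */
	SKETCH_FLAT_TYPES = 1 << (SKETCH_TYPE_BITS + SKETCH_NEW_TYPE_BITS),
};
```

So the escape scheme encodes 31 distinct values, just under 2^5, while the same 8 bits read as one number could encode 256.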

Would something like:

		type:4,
		ext_type:4,
		reserved:36;

and have all software do:

	type = pbe->type | (pbe->ext_type << 4);

Then old software will only know about the old types. New software on
old kernels will add 4 0's, which is harmless, while new software on new
kernels will get 8 bits of type.

Would that work?
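
A minimal userspace sketch of that decode, for illustration only. struct pbe_sketch and pbe_full_type() are hypothetical names, the struct merely mirrors the flags word of perf_branch_entry with the layout suggested above, and 64 bit wide bitfields rely on the usual GCC/Clang behaviour:

```c
#include <assert.h>

/* Hypothetical stand-in for the flags word of struct perf_branch_entry. */
struct pbe_sketch {
	unsigned long long
		mispred:1,   /* target mispredicted */
		predicted:1, /* target predicted */
		in_tx:1,     /* in transaction */
		abort:1,     /* transaction abort */
		cycles:16,   /* cycle count to last branch */
		type:4,      /* branch type, low 4 bits */
		ext_type:4,  /* branch type, high 4 bits; 0 on old kernels */
		reserved:36;
};

/* The whole flags word must still fit in 64 bits. */
static_assert(sizeof(struct pbe_sketch) == 8, "flags word must be 64 bits");

/* Combine both fields into one 8 bit branch type. */
static inline unsigned int pbe_full_type(const struct pbe_sketch *pbe)
{
	return pbe->type | (pbe->ext_type << 4);
}
```

On a kernel that never sets ext_type the field stays in the zeroed reserved bits, so pbe_full_type() returns plain type and the existing values 0..14 keep their meaning.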


