[PATCH V4 03/10] perf: Extend branch type classification

Peter Zijlstra peterz at infradead.org
Wed Mar 16 05:20:01 PDT 2022


On Tue, Mar 15, 2022 at 01:06:42PM +0000, Robin Murphy wrote:
> On 2022-03-15 11:22, Peter Zijlstra wrote:
> > On Tue, Mar 15, 2022 at 11:05:09AM +0530, Anshuman Khandual wrote:
> > > branch_entry.type has now run out of space to accommodate more branch type
> > > classifications. This will prevent the perf branch stack implementation on
> > > arm64 (via BRBE) from capturing all available branch types. Extending this
> > > bit field, i.e. branch_entry.type [4 bits], is not an option as it will break
> > > the user space ABI for both little and big endian perf tools.
> > > 
> > > Extend branch classification with a new field branch_entry.new_type via a
> > > new branch type PERF_BR_EXTEND_ABI in branch_entry.type. Perf tools which
> > > can decode PERF_BR_EXTEND_ABI will then parse branch_entry.new_type as
> > > well.
> > > 
> > > branch_entry.new_type is a 4-bit field which can hold up to 16 branch types.
> > > The first three branch types will hold various generic page faults, followed
> > > by five architecture specific branch types, which can be overridden by the
> > > platform for specific use cases. These architecture specific branch types
> > > get overridden on the arm64 platform for the BRBE implementation.
> > 
> > > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > > index 26d8f0b5ac0d..d29280adc3c4 100644
> > > --- a/include/uapi/linux/perf_event.h
> > > +++ b/include/uapi/linux/perf_event.h
> > > @@ -255,9 +255,22 @@ enum {
> > >   	PERF_BR_IRQ		= 12,	/* irq */
> > >   	PERF_BR_SERROR		= 13,	/* system error */
> > >   	PERF_BR_NO_TX		= 14,	/* not in transaction */
> > > +	PERF_BR_EXTEND_ABI	= 15,	/* extend ABI */
> > >   	PERF_BR_MAX,
> > >   };
> > 
> > 
> > >   #define PERF_SAMPLE_BRANCH_PLM_ALL \
> > >   	(PERF_SAMPLE_BRANCH_USER|\
> > >   	 PERF_SAMPLE_BRANCH_KERNEL|\
> > > @@ -1372,7 +1385,8 @@ struct perf_branch_entry {
> > >   		abort:1,    /* transaction abort */
> > >   		cycles:16,  /* cycle count to last branch */
> > >   		type:4,     /* branch type */
> > > -		reserved:40;
> > > +		new_type:4, /* additional branch type */
> > > +		reserved:36;
> > >   };
> > 
> > Hurmpf... this will effectively give us 5 bits of space for the cost of
> > 8, that seems... unfortunate.
> > 
> > Would something like:
> > 
> > 		type:4,
> > 		ext_type:4,
> > 		reserved:36;
> > 
> > and have all software do:
> > 
> > 	type = pbe->type | (pbe->ext_type << 4);
> > 
> > Then old software will only know about the old types. New software on
> > old kernels will add 4 0's, which is harmless, while new software on new
> > kernels will get 8 bits of type.
> > 
> > Would that work?
> 
> Depends how bad the effects of aliasing in existing software would be, I
> guess - e.g. new kernel outputs type 0x23 which software then interprets as
> 0x3 since it doesn't know about the extended bits. I'm guessing that's more
> likely "confusing to the user" than "catastrophically fatal", but it might
> still matter.
> 
> If software had an explicit opt-in to receiving extended types when
> requesting branch sampling in the first place we could avoid that worry, but
> then we'd need some additional complexity to sanitise records depending on
> that option :/

Bah.. I see.. One option is PERF_SAMPLE_BRANCH_STACK2, but yes, yuck.
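
For reference, a minimal sketch of the combining scheme and of the aliasing
concern raised above. The struct name pbe_sketch, the collapsed "other" field
and both helper names are hypothetical and exist only for illustration; the
ext_type field is the proposal from this thread, not merged ABI.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the tail of struct perf_branch_entry.
 * "other" collapses mispred/predicted/in_tx/abort (1 bit each) and
 * cycles (16 bits); ext_type is the proposed field, not existing ABI.
 */
struct pbe_sketch {
	uint64_t other:20,
		 type:4,      /* existing 4-bit branch type */
		 ext_type:4,  /* proposed upper 4 bits of the type */
		 reserved:36;
};

/* The bit-fields above are meant to fill exactly one 64-bit word. */
static_assert(sizeof(struct pbe_sketch) == sizeof(uint64_t),
	      "bit-fields should pack into a single u64");

/* New software: combine both fields into one 8-bit type value. */
static inline unsigned int pbe_full_type(const struct pbe_sketch *pbe)
{
	return (unsigned int)pbe->type | ((unsigned int)pbe->ext_type << 4);
}

/* Old software: only knows about the low 4 bits. */
static inline unsigned int pbe_legacy_type(const struct pbe_sketch *pbe)
{
	return (unsigned int)pbe->type;
}
```

With type = 0x3 and ext_type = 0x2, pbe_full_type() yields 0x23 while
pbe_legacy_type() yields 0x3, which is the aliasing case described above.
With ext_type left at 0 (an old kernel), both helpers agree.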


