[PATCH v6 6/8] perf cs-etm: Filter synthesized branch samples

Leo Yan leo.yan at arm.com
Mon Jun 8 04:28:34 PDT 2026


On Thu, Jun 04, 2026 at 03:42:32PM +0100, James Clark wrote:

[...]

> > @@ -3442,6 +3447,16 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
> >   		etm->synth_opts.thread_stack = session->itrace_synth_opts->thread_stack;
> >   	}
> > +	if (etm->synth_opts.calls)
> > +		etm->branches_filter |= PERF_IP_FLAG_CALL |
> > +					PERF_IP_FLAG_TRACE_BEGIN |
> > +					PERF_IP_FLAG_TRACE_END;
> > +
> > +	if (etm->synth_opts.returns)
> > +		etm->branches_filter |= PERF_IP_FLAG_RETURN |
> > +					PERF_IP_FLAG_TRACE_BEGIN |
> > +					PERF_IP_FLAG_TRACE_END;
> > +
> 
> This changes the default "perf script" output quite significantly and will
> possibly break people's workflows. synth_opts.calls is true by default but
> synth_opts.returns is false so we lose all the returns that we used to have.
> Not sure if the new behavior is more consistent with other tools so we can
> justify changing it? Personally I think including returns by default made
> more sense, and it's a more literal representation of the flow.

Makes sense. I will add the chunk below to enable return events with the
default options:

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index ab3aa76dddb3..bd9eb794cc07 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -3541,6 +3541,14 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
                                session->itrace_synth_opts->default_no_sample);
                etm->synth_opts.callchain = false;
                etm->synth_opts.thread_stack = session->itrace_synth_opts->thread_stack;
+
+               /*
+                * By default, only call events are enabled but no return
+                * events. Enable return events to better represent the
+                * execution flow.
+                */
+               if (etm->synth_opts.calls)
+                       etm->synth_opts.returns = true;
        }

> 
> itrace.txt says the default is "all events i.e. the same as
> --itrace=iybxwpe", but I thought the default was branches? At least for
> Coresight it is, so I'm a bit confused.

"--itrace=iybxwpe" would be used for the "perf report" command; the doc
also mentions "--itrace=ce" specifically for "perf script".
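
To illustrate how the quoted v6 hunk and the chunk above are meant to
combine, here is a minimal standalone sketch. It is not perf code: the
sk_* names, the struct, and the flag bit values are stand-ins made up for
this example, and only the control flow mirrors the two hunks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag bits for illustration only; not taken from perf headers. */
enum {
	SK_FLAG_CALL        = 1u << 0,
	SK_FLAG_RETURN      = 1u << 1,
	SK_FLAG_TRACE_BEGIN = 1u << 2,
	SK_FLAG_TRACE_END   = 1u << 3,
};

/* Hypothetical reduction of the synth options to the two relevant fields. */
struct sk_synth_opts {
	bool calls;
	bool returns;
};

/* Default-option path: turning on calls also turns on returns. */
static void sk_apply_defaults(struct sk_synth_opts *opts)
{
	if (opts->calls)
		opts->returns = true;
}

/* Build the branch filter mask in the same way as the quoted v6 hunk. */
static uint32_t sk_branches_filter(const struct sk_synth_opts *opts)
{
	uint32_t filter = 0;

	if (opts->calls)
		filter |= SK_FLAG_CALL | SK_FLAG_TRACE_BEGIN | SK_FLAG_TRACE_END;

	if (opts->returns)
		filter |= SK_FLAG_RETURN | SK_FLAG_TRACE_BEGIN | SK_FLAG_TRACE_END;

	return filter;
}
```

Starting from calls = true and returns = false, calling
sk_apply_defaults() before sk_branches_filter() yields a mask that
includes the return bit, so return samples are kept in the default case.
Without the defaults step, the same input would filter them out.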

Thanks,
Leo


