[PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps

Leo Yan leo.yan at arm.com
Mon Jan 19 03:15:09 PST 2026


On Mon, Jan 19, 2026 at 10:18:35AM +0000, Coresight ML wrote:
> The ETM decoder incorrectly assumed that auxtrace queue indices were
> equivalent to CPU numbers. This assumption is used for inserting records
> into the queue, and for fetching queues when given a CPU number. This
> assumption held when Perf always opened a dummy event on every CPU, even
> if the user provided a subset of CPUs on the command line, resulting in
> the indices aligning.
> 
> For example:
> 
>   # event : name = cs_etm//u, , id = { 2451, 2452 }, type = 11 (cs_etm), size = 136, config = 0x4010, { sample_period, samp>
>   # event : name = dummy:u, , id = { 2453, 2454, 2455, 2456 }, type = 1 (PERF_TYPE_SOFTWARE), size = 136, config = 0x9 (PER>
> 
>   0 0 0x200 [0xd0]: PERF_RECORD_ID_INDEX nr: 6
>   ... id: 2451  idx: 2  cpu: 2  tid: -1
>   ... id: 2452  idx: 3  cpu: 3  tid: -1
>   ... id: 2453  idx: 0  cpu: 0  tid: -1
>   ... id: 2454  idx: 1  cpu: 1  tid: -1
>   ... id: 2455  idx: 2  cpu: 2  tid: -1
>   ... id: 2456  idx: 3  cpu: 3  tid: -1
> 
> Since commit 811082e4b668 ("perf parse-events: Support user CPUs mixed
> with threads/processes") the dummy event no longer behaves in this way,
> making the ETM event indices start from 0 on the first CPU recorded
> regardless of its ID:
> 
>   # event : name = cs_etm//u, , id = { 771, 772 }, type = 11 (cs_etm), size = 144, config = 0x4010, { sample_period, sample>
>   # event : name = dummy:u, , id = { 773, 774 }, type = 1 (PERF_TYPE_SOFTWARE), size = 144, config = 0x9 (PERF_COUNT_SW_DUM>
> 
>   0 0 0x200 [0x90]: PERF_RECORD_ID_INDEX nr: 4
>   ... id: 771  idx: 0  cpu: 2  tid: -1
>   ... id: 772  idx: 1  cpu: 3  tid: -1
>   ... id: 773  idx: 0  cpu: 2  tid: -1
>   ... id: 774  idx: 1  cpu: 3  tid: -1
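
To make the mismatch concrete, here is a minimal standalone sketch. It is
not perf code: the struct and helper names (id_index_entry, idx_matches_cpu,
cpu_for_idx, sketch_self_check) are hypothetical, and the table values are
copied from the two PERF_RECORD_ID_INDEX dumps quoted above. It only shows
that idx equals cpu in the old layout and no longer does in the new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of one line of a PERF_RECORD_ID_INDEX dump. */
struct id_index_entry {
	int id;
	int idx;
	int cpu;
};

/* Values from the first dump (dummy event opened on every CPU). */
static const struct id_index_entry old_layout[] = {
	{ 2451, 2, 2 }, { 2452, 3, 3 }, { 2453, 0, 0 },
	{ 2454, 1, 1 }, { 2455, 2, 2 }, { 2456, 3, 3 },
};

/* Values from the second dump (after commit 811082e4b668). */
static const struct id_index_entry new_layout[] = {
	{ 771, 0, 2 }, { 772, 1, 3 }, { 773, 0, 2 }, { 774, 1, 3 },
};

/* True when every entry has idx == cpu, so an index can stand in for a CPU. */
static bool idx_matches_cpu(const struct id_index_entry *e, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (e[i].idx != e[i].cpu)
			return false;
	}
	return true;
}

/* Translate an index to its CPU; returns -1 if the index is not present. */
static int cpu_for_idx(const struct id_index_entry *e, size_t nr, int idx)
{
	for (size_t i = 0; i < nr; i++) {
		if (e[i].idx == idx)
			return e[i].cpu;
	}
	return -1;
}

/* Checks the two layouts against the assumption "idx == cpu". */
static void sketch_self_check(void)
{
	assert(idx_matches_cpu(old_layout, 6));   /* assumption holds */
	assert(!idx_matches_cpu(new_layout, 4));  /* assumption broken */
	assert(cpu_for_idx(new_layout, 4, 0) == 2);
	assert(cpu_for_idx(new_layout, 4, 1) == 3);
}
```

With the second dump, index 0 belongs to CPU 2 and index 1 to CPU 3, so any
code that uses a CPU number where an index is expected (or the reverse) ends
up at a different slot than the one the data was queued on.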

It seems to me that this patch works around the issue by using the CPU ID
instead, but event->auxtrace.idx is still broken.

Should we store the correct index in event->auxtrace.idx (e.g., in
__perf_event__synthesize_id_index())?

Thanks,
Leo

> This causes the following segfault when decoding:
> 
>   $ perf record -e cs_etm//u -C 2,3 -- true
>   $ perf report
> 
>   perf: Segmentation fault
>   -------- backtrace --------
>   #0 0xaaaabf9fd020 in ui__signal_backtrace setup.c:110
>   #1 0xffffab5c7930 in __kernel_rt_sigreturn [vdso][930]
>   #2 0xaaaabfb68d30 in cs_etm_decoder__reset cs-etm-decoder.c:85
>   #3 0xaaaabfb65930 in cs_etm__get_data_block cs-etm.c:2032
>   #4 0xaaaabfb666fc in cs_etm__run_per_cpu_timeless_decoder cs-etm.c:2551
>   #5 0xaaaabfb6692c in cs_etm__process_timeless_queues cs-etm.c:2612
>   #6 0xaaaabfb63390 in cs_etm__flush_events cs-etm.c:921
>   #7 0xaaaabfb324c0 in auxtrace__flush_events auxtrace.c:2915
>   #8 0xaaaabfaac378 in __perf_session__process_events session.c:2285
>   #9 0xaaaabfaacc9c in perf_session__process_events session.c:2442
>   #10 0xaaaabf8d3d90 in __cmd_report builtin-report.c:1085
>   #11 0xaaaabf8d6944 in cmd_report builtin-report.c:1866
>   #12 0xaaaabf95ebfc in run_builtin perf.c:351
>   #13 0xaaaabf95eeb0 in handle_internal_command perf.c:404
>   #14 0xaaaabf95f068 in run_argv perf.c:451
>   #15 0xaaaabf95f390 in main perf.c:558
>   #16 0xffffaab97400 in __libc_start_call_main libc_start_call_main.h:74
>   #17 0xffffaab974d8 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
>   #18 0xaaaabf8aa8f0 in _start perf[7a8f0]
> 
> Fix it by inserting into the queues based on CPU number, rather than
> using the index.
> 
> Fixes: 811082e4b668 ("perf parse-events: Support user CPUs mixed with threads/processes")
> Signed-off-by: James Clark <james.clark at linaro.org>
> ---
>  tools/perf/util/cs-etm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 25d56e0f1c07..12b55c2bc2ca 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -3086,7 +3086,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>  
>  	if (aux_offset >= auxtrace_event->offset &&
>  	    aux_offset + aux_size <= auxtrace_event->offset + auxtrace_event->size) {
> -		struct cs_etm_queue *etmq = etm->queues.queue_array[auxtrace_event->idx].priv;
> +		struct cs_etm_queue *etmq = cs_etm__get_queue(etm, auxtrace_event->cpu);
>  
>  		/*
>  		 * If this AUX event was inside this buffer somewhere, create a new auxtrace event
> @@ -3095,6 +3095,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>  		auxtrace_fragment.auxtrace = *auxtrace_event;
>  		auxtrace_fragment.auxtrace.size = aux_size;
>  		auxtrace_fragment.auxtrace.offset = aux_offset;
> +		auxtrace_fragment.auxtrace.idx = etmq->queue_nr;
>  		file_offset += aux_offset - auxtrace_event->offset + auxtrace_event->header.size;
>  
>  		pr_debug3("CS ETM: Queue buffer size: %#"PRI_lx64" offset: %#"PRI_lx64
> 
> -- 
> 2.34.1


