[PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps

James Clark james.clark at linaro.org
Mon Jan 19 03:55:53 PST 2026



On 19/01/2026 11:15 am, Leo Yan wrote:
> On Mon, Jan 19, 2026 at 10:18:35AM +0000, Coresight ML wrote:
>> The ETM decoder incorrectly assumed that auxtrace queue indices were
>> equivalent to CPU numbers. This assumption is used for inserting records
>> into the queue, and for fetching queues when given a CPU number. This
>> assumption held when Perf always opened a dummy event on every CPU, even
>> if the user provided a subset of CPUs on the command line, resulting in
>> the indices aligning.
>>
>> For example:
>>
>>    # event : name = cs_etm//u, , id = { 2451, 2452 }, type = 11 (cs_etm), size = 136, config = 0x4010, { sample_period, samp>
>>    # event : name = dummy:u, , id = { 2453, 2454, 2455, 2456 }, type = 1 (PERF_TYPE_SOFTWARE), size = 136, config = 0x9 (PER>
>>
>>    0 0 0x200 [0xd0]: PERF_RECORD_ID_INDEX nr: 6
>>    ... id: 2451  idx: 2  cpu: 2  tid: -1
>>    ... id: 2452  idx: 3  cpu: 3  tid: -1
>>    ... id: 2453  idx: 0  cpu: 0  tid: -1
>>    ... id: 2454  idx: 1  cpu: 1  tid: -1
>>    ... id: 2455  idx: 2  cpu: 2  tid: -1
>>    ... id: 2456  idx: 3  cpu: 3  tid: -1
>>
>> Since commit 811082e4b668 ("perf parse-events: Support user CPUs mixed
>> with threads/processes") the dummy event no longer behaves in this way,
>> making the ETM event indices start from 0 on the first CPU recorded
>> regardless of its ID:
>>
>>    # event : name = cs_etm//u, , id = { 771, 772 }, type = 11 (cs_etm), size = 144, config = 0x4010, { sample_period, sample>
>>    # event : name = dummy:u, , id = { 773, 774 }, type = 1 (PERF_TYPE_SOFTWARE), size = 144, config = 0x9 (PERF_COUNT_SW_DUM>
>>
>>    0 0 0x200 [0x90]: PERF_RECORD_ID_INDEX nr: 4
>>    ... id: 771  idx: 0  cpu: 2  tid: -1
>>    ... id: 772  idx: 1  cpu: 3  tid: -1
>>    ... id: 773  idx: 0  cpu: 2  tid: -1
>>    ... id: 774  idx: 1  cpu: 3  tid: -1
> 
> Seems to me that this patch works around the issue by using the CPU ID
> instead, but event->auxtrace.idx is broken.
> 
> Should we store the correct index in event->auxtrace.idx (e.g., in the
> __perf_event__synthesize_id_index()) ?
> 
> Thanks,
> Leo
> 

I don't think it's a workaround; I think it should have been written
this way in the first place.

idx is exactly what its name says: the position in the array. If there
are only two events then they have idx 0 and 1, but the CPU can be
anything. If ETM hadn't opened a dummy event at the same time, I think
it would always have behaved like that.
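
To illustrate the distinction, here is a minimal standalone sketch. It
is not the perf code: demo_queue, queue_by_idx() and queue_by_cpu() are
hypothetical names made up for this example, and the linear search is
only an assumption about how a CPU lookup could work, not a description
of cs_etm__get_queue(). It models the second PERF_RECORD_ID_INDEX dump
above, where idx 0 and 1 belong to CPUs 2 and 3.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-CPU trace queue. */
struct demo_queue {
	int cpu;         /* CPU the trace data was recorded on */
	unsigned int nr; /* position of this queue in the array */
};

/*
 * Index-based lookup: treats the given number as an array position.
 * This is only right when the position happens to equal the CPU number.
 */
static struct demo_queue *queue_by_idx(struct demo_queue *queues,
				       size_t nr_queues, unsigned int idx)
{
	return idx < nr_queues ? &queues[idx] : NULL;
}

/*
 * CPU-based lookup: finds the queue that recorded the given CPU,
 * wherever it sits in the array. Returns NULL if there is none.
 */
static struct demo_queue *queue_by_cpu(struct demo_queue *queues,
				       size_t nr_queues, int cpu)
{
	for (size_t i = 0; i < nr_queues; i++) {
		if (queues[i].cpu == cpu)
			return &queues[i];
	}
	return NULL;
}
```

With two queues holding CPUs 2 and 3, passing CPU 2 to queue_by_idx()
finds nothing because position 2 does not exist, while queue_by_cpu()
returns the queue at position 0.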

>> This causes the following segfault when decoding:
>>
>>    $ perf record -e cs_etm//u -C 2,3 -- true
>>    $ perf report
>>
>>    perf: Segmentation fault
>>    -------- backtrace --------
>>    #0 0xaaaabf9fd020 in ui__signal_backtrace setup.c:110
>>    #1 0xffffab5c7930 in __kernel_rt_sigreturn [vdso][930]
>>    #2 0xaaaabfb68d30 in cs_etm_decoder__reset cs-etm-decoder.c:85
>>    #3 0xaaaabfb65930 in cs_etm__get_data_block cs-etm.c:2032
>>    #4 0xaaaabfb666fc in cs_etm__run_per_cpu_timeless_decoder cs-etm.c:2551
>>    #5 0xaaaabfb6692c in cs_etm__process_timeless_queues cs-etm.c:2612
>>    #6 0xaaaabfb63390 in cs_etm__flush_events cs-etm.c:921
>>    #7 0xaaaabfb324c0 in auxtrace__flush_events auxtrace.c:2915
>>    #8 0xaaaabfaac378 in __perf_session__process_events session.c:2285
>>    #9 0xaaaabfaacc9c in perf_session__process_events session.c:2442
>>    #10 0xaaaabf8d3d90 in __cmd_report builtin-report.c:1085
>>    #11 0xaaaabf8d6944 in cmd_report builtin-report.c:1866
>>    #12 0xaaaabf95ebfc in run_builtin perf.c:351
>>    #13 0xaaaabf95eeb0 in handle_internal_command perf.c:404
>>    #14 0xaaaabf95f068 in run_argv perf.c:451
>>    #15 0xaaaabf95f390 in main perf.c:558
>>    #16 0xffffaab97400 in __libc_start_call_main libc_start_call_main.h:74
>>    #17 0xffffaab974d8 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
>>    #18 0xaaaabf8aa8f0 in _start perf[7a8f0]
>>
>> Fix it by inserting into the queues based on CPU number, rather than
>> using the index.
>>
>> Fixes: 811082e4b668 ("perf parse-events: Support user CPUs mixed with threads/processes")
>> Signed-off-by: James Clark <james.clark at linaro.org>
>> ---
>>   tools/perf/util/cs-etm.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>> index 25d56e0f1c07..12b55c2bc2ca 100644
>> --- a/tools/perf/util/cs-etm.c
>> +++ b/tools/perf/util/cs-etm.c
>> @@ -3086,7 +3086,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>>   
>>   	if (aux_offset >= auxtrace_event->offset &&
>>   	    aux_offset + aux_size <= auxtrace_event->offset + auxtrace_event->size) {
>> -		struct cs_etm_queue *etmq = etm->queues.queue_array[auxtrace_event->idx].priv;
>> +		struct cs_etm_queue *etmq = cs_etm__get_queue(etm, auxtrace_event->cpu);
>>   
>>   		/*
>>   		 * If this AUX event was inside this buffer somewhere, create a new auxtrace event
>> @@ -3095,6 +3095,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>>   		auxtrace_fragment.auxtrace = *auxtrace_event;
>>   		auxtrace_fragment.auxtrace.size = aux_size;
>>   		auxtrace_fragment.auxtrace.offset = aux_offset;
>> +		auxtrace_fragment.auxtrace.idx = etmq->queue_nr;
>>   		file_offset += aux_offset - auxtrace_event->offset + auxtrace_event->header.size;
>>   
>>   		pr_debug3("CS ETM: Queue buffer size: %#"PRI_lx64" offset: %#"PRI_lx64
>>
>> -- 
>> 2.34.1
>>
>> _______________________________________________
>> CoreSight mailing list -- coresight at lists.linaro.org
>> To unsubscribe send an email to coresight-leave at lists.linaro.org



