[PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps

James Clark james.clark at linaro.org
Mon Jan 19 07:16:01 PST 2026



On 19/01/2026 2:53 pm, Leo Yan wrote:
> On Mon, Jan 19, 2026 at 11:55:53AM +0000, James Clark wrote:
> 
> [...]
> 
>>>>     0 0 0x200 [0x90]: PERF_RECORD_ID_INDEX nr: 4
>>>>     ... id: 771  idx: 0  cpu: 2  tid: -1
>>>>     ... id: 772  idx: 1  cpu: 3  tid: -1
>>>>     ... id: 773  idx: 0  cpu: 2  tid: -1
>>>>     ... id: 774  idx: 1  cpu: 3  tid: -1
>>>
>>> Seems to me that this patch works around the issue by using the CPU ID
>>> instead, but event->auxtrace.idx is broken.
>>
>> I don't think it's a workaround, I think it should have been written this
>> way in the first place.
> 
> Agree.  Sorry for confusion in my rely.
> 
> [...]
> 
>>>> @@ -3086,7 +3086,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>>>>    	if (aux_offset >= auxtrace_event->offset &&
>>>>    	    aux_offset + aux_size <= auxtrace_event->offset + auxtrace_event->size) {
>>>> -		struct cs_etm_queue *etmq = etm->queues.queue_array[auxtrace_event->idx].priv;
>>>> +		struct cs_etm_queue *etmq = cs_etm__get_queue(etm, auxtrace_event->cpu);
> 
> During decoding, cs_etm allocates its own AUX queues and needs to
> convert the AUX records into a cs_etm-specific format.
> 
> For example:
> 
>    event_id=405 idx=0 cpu=0 tid=6673
>    event_id=406 idx=1 cpu=1 tid=6673
>    event_id=407 idx=2 cpu=3 tid=6673
>    event_id=408 idx=3 cpu=4 tid=6673
>    event_id=409 idx=4 cpu=5 tid=6673
>    `            `      `> CPU ID
>     `> Event ID  `> Generic buffer IDs
> 
> In this case, I deliberately hotplugged off CPU2.  As a result, the
> buffer IDs are consecutive from 0 to 4, while the CPU ID for CPU2 is
> missing.  When adding AUX records to the cs_etm queues, we need to
> convert from the generic buffer index to the corresponding CPU ID.
> 
> So the above change makes sense to me.
> 
>>>>    		/*
>>>>    		 * If this AUX event was inside this buffer somewhere, create a new auxtrace event
>>>> @@ -3095,6 +3095,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>>>>    		auxtrace_fragment.auxtrace = *auxtrace_event;
>>>>    		auxtrace_fragment.auxtrace.size = aux_size;
>>>>    		auxtrace_fragment.auxtrace.offset = aux_offset;
>>>> +		auxtrace_fragment.auxtrace.idx = etmq->queue_nr;
> 
> I am still confused about this.  Because above auxtrace will be queued
> on cs_etm queues, should we convert from the generic buffer index to the
> cs_etm specific one?  E.g.,
> 
>    auxtrace_fragment.auxtrace.idx = auxtrace_event->cpu;
> 

This is basically what the change is already doing. etmq->queue_nr == 
CPU because cs_etm__setup_queue() is called for every CPU up to max_cpu, 
not only for ones that were recorded so it has a 1:1 mapping. Using 
etmq->queue_nr also works in per-thread mode where queue 0 is used, 
which your suggestion doesn't handle. Otherwise they would be equivalent.

To be honest the whole decoder has become hacks on top of hacks. I think 
we might want to do a full from scratch re-write when we come to do the 
proper timestamp to MMAP event correlation or the change to not keep 
resetting the decoder for TRBE wrap mode.

> BTW, if my understanding above is valid, it is good to go through the
> cs_etm.c file for the "idx <-> CPU ID" conversion.
> 

Do you mean there are other mistakes? I did check the rest and did some 
testing and didn't see any issues.

> Thanks,
> Leo
> 
>>>>    		file_offset += aux_offset - auxtrace_event->offset + auxtrace_event->header.size;
>>>>    		pr_debug3("CS ETM: Queue buffer size: %#"PRI_lx64" offset: %#"PRI_lx64
>>>>
>>>> -- 
>>>> 2.34.1
>>>>
>>>> _______________________________________________
>>>> CoreSight mailing list -- coresight at lists.linaro.org
>>>> To unsubscribe send an email to coresight-leave at lists.linaro.org
>>




More information about the linux-arm-kernel mailing list