[PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps

Adrian Hunter adrian.hunter at intel.com
Mon Jan 19 04:11:43 PST 2026


On 19/01/2026 13:15, Leo Yan wrote:
> On Mon, Jan 19, 2026 at 10:18:35AM +0000, Coresight ML wrote:
>> The ETM decoder incorrectly assumed that auxtrace queue indices were
>> equivalent to CPU numbers. This assumption is used for inserting records
>> into the queue, and for fetching queues when given a CPU number. This
>> assumption held when Perf always opened a dummy event on every CPU, even
>> if the user provided a subset of CPUs on the command line, resulting in
>> the indices aligning.
>>
>> For example:
>>
>>   # event : name = cs_etm//u, , id = { 2451, 2452 }, type = 11 (cs_etm), size = 136, config = 0x4010, { sample_period, samp>
>>   # event : name = dummy:u, , id = { 2453, 2454, 2455, 2456 }, type = 1 (PERF_TYPE_SOFTWARE), size = 136, config = 0x9 (PER>
>>
>>   0 0 0x200 [0xd0]: PERF_RECORD_ID_INDEX nr: 6
>>   ... id: 2451  idx: 2  cpu: 2  tid: -1
>>   ... id: 2452  idx: 3  cpu: 3  tid: -1
>>   ... id: 2453  idx: 0  cpu: 0  tid: -1
>>   ... id: 2454  idx: 1  cpu: 1  tid: -1
>>   ... id: 2455  idx: 2  cpu: 2  tid: -1
>>   ... id: 2456  idx: 3  cpu: 3  tid: -1
>>
>> Since commit 811082e4b668 ("perf parse-events: Support user CPUs mixed
>> with threads/processes") the dummy event no longer behaves in this way,
>> making the ETM event indices start from 0 on the first CPU recorded
>> regardless of its ID:
>>
>>   # event : name = cs_etm//u, , id = { 771, 772 }, type = 11 (cs_etm), size = 144, config = 0x4010, { sample_period, sample>
>>   # event : name = dummy:u, , id = { 773, 774 }, type = 1 (PERF_TYPE_SOFTWARE), size = 144, config = 0x9 (PERF_COUNT_SW_DUM>
>>
>>   0 0 0x200 [0x90]: PERF_RECORD_ID_INDEX nr: 4
>>   ... id: 771  idx: 0  cpu: 2  tid: -1
>>   ... id: 772  idx: 1  cpu: 3  tid: -1
>>   ... id: 773  idx: 0  cpu: 2  tid: -1
>>   ... id: 774  idx: 1  cpu: 3  tid: -1
> 
> Seems to me that this patch works around the issue by using the CPU ID
> instead, but event->auxtrace.idx is broken.
> 
> Should we store the correct index in event->auxtrace.idx (e.g., in the
> __perf_event__synthesize_id_index()) ?

The idx value represents a perf events ring buffer.  Events on the same
CPU can share the same ring buffer.  But in the case of per-thread
recording, different threads have different ring buffers and therefore
different idx values.

So I don't think the idx value is wrong.  It is just not the same thing
as the CPU number.
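
To illustrate the distinction, here is a minimal sketch in C of looking up
a ring buffer idx from a CPU number instead of assuming the two are equal.
The struct and the helper names (id_index_entry, idx_for_cpu, cpu_for_idx)
are hypothetical and are not the perf tool's actual data structures; the
table values are copied from the second PERF_RECORD_ID_INDEX dump quoted
above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one PERF_RECORD_ID_INDEX entry. */
struct id_index_entry {
	unsigned long long id;	/* sample ID */
	int idx;		/* ring buffer index */
	int cpu;		/* CPU number, -1 if not bound to a CPU */
	int tid;		/* thread ID, -1 if not bound to a thread */
};

/* The four entries from the sparse (CPUs 2 and 3 only) dump quoted above. */
static const struct id_index_entry sparse_entries[] = {
	{ 771, 0, 2, -1 },
	{ 772, 1, 3, -1 },
	{ 773, 0, 2, -1 },
	{ 774, 1, 3, -1 },
};
static const size_t sparse_nr =
	sizeof(sparse_entries) / sizeof(sparse_entries[0]);

/* Return the idx of the first entry recorded on 'cpu', or -1 if none. */
static int idx_for_cpu(const struct id_index_entry *entries, size_t nr, int cpu)
{
	for (size_t i = 0; i < nr; i++) {
		if (entries[i].cpu == cpu)
			return entries[i].idx;
	}
	return -1;
}

/* Return the CPU of the first entry using ring buffer 'idx', or -1 if none. */
static int cpu_for_idx(const struct id_index_entry *entries, size_t nr, int idx)
{
	for (size_t i = 0; i < nr; i++) {
		if (entries[i].idx == idx)
			return entries[i].cpu;
	}
	return -1;
}
```

With this table, idx_for_cpu(sparse_entries, sparse_nr, 2) gives 0, whereas
using CPU number 2 directly as an index would not match any idx present in
that recording.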



