[PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps
Adrian Hunter
adrian.hunter at intel.com
Mon Jan 19 04:11:43 PST 2026
On 19/01/2026 13:15, Leo Yan wrote:
> On Mon, Jan 19, 2026 at 10:18:35AM +0000, Coresight ML wrote:
>> The ETM decoder incorrectly assumed that auxtrace queue indices were
>> equivalent to CPU number. This assumption is used for inserting records
>> into the queue, and for fetching queues when given a CPU number. This
>> assumption held when Perf always opened a dummy event on every CPU, even
>> if the user provided a subset of CPUs on the commandline, resulting in
>> the indices aligning.
>>
>> For example:
>>
>> # event : name = cs_etm//u, , id = { 2451, 2452 }, type = 11 (cs_etm), size = 136, config = 0x4010, { sample_period, samp>
>> # event : name = dummy:u, , id = { 2453, 2454, 2455, 2456 }, type = 1 (PERF_TYPE_SOFTWARE), size = 136, config = 0x9 (PER>
>>
>> 0 0 0x200 [0xd0]: PERF_RECORD_ID_INDEX nr: 6
>> ... id: 2451 idx: 2 cpu: 2 tid: -1
>> ... id: 2452 idx: 3 cpu: 3 tid: -1
>> ... id: 2453 idx: 0 cpu: 0 tid: -1
>> ... id: 2454 idx: 1 cpu: 1 tid: -1
>> ... id: 2455 idx: 2 cpu: 2 tid: -1
>> ... id: 2456 idx: 3 cpu: 3 tid: -1
>>
>> Since commit 811082e4b668 ("perf parse-events: Support user CPUs mixed
>> with threads/processes") the dummy event no longer behaves in this way,
>> making the ETM event indices start from 0 on the first CPU recorded
>> regardless of its ID:
>>
>> # event : name = cs_etm//u, , id = { 771, 772 }, type = 11 (cs_etm), size = 144, config = 0x4010, { sample_period, sample>
>> # event : name = dummy:u, , id = { 773, 774 }, type = 1 (PERF_TYPE_SOFTWARE), size = 144, config = 0x9 (PERF_COUNT_SW_DUM>
>>
>> 0 0 0x200 [0x90]: PERF_RECORD_ID_INDEX nr: 4
>> ... id: 771 idx: 0 cpu: 2 tid: -1
>> ... id: 772 idx: 1 cpu: 3 tid: -1
>> ... id: 773 idx: 0 cpu: 2 tid: -1
>> ... id: 774 idx: 1 cpu: 3 tid: -1
>
> Seems to me that this patch works around the issue by using the CPU ID
> instead, but event->auxtrace.idx is broken.
>
> Should we store the correct index in event->auxtrace.idx (e.g., in the
> __perf_event__synthesize_id_index()) ?

The idx value identifies a perf events ring buffer. Events on the same
CPU can share the same ring buffer, but with per-thread recording,
different threads have different ring buffers and therefore different
idx values.

So I don't think the idx value is wrong. It is just not the same thing
as the CPU number.
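
As a rough illustration only (this is not perf code; the struct, the
table and both helper names are made up for the example), the
idx-to-CPU relation can be treated as a lookup over the
PERF_RECORD_ID_INDEX entries rather than an identity. The table below
copies the four entries from the second dump quoted above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of one PERF_RECORD_ID_INDEX entry, not the perf struct. */
struct id_index_entry {
	unsigned long long id;
	int idx;	/* ring buffer index */
	int cpu;	/* CPU number, -1 for per-thread recording */
	int tid;
};

/* Entries copied from the second dump above: only CPUs 2 and 3 recorded. */
static const struct id_index_entry sparse_entries[] = {
	{ 771, 0, 2, -1 },
	{ 772, 1, 3, -1 },
	{ 773, 0, 2, -1 },
	{ 774, 1, 3, -1 },
};

#define NR_SPARSE_ENTRIES (sizeof(sparse_entries) / sizeof(sparse_entries[0]))

/* Return the ring buffer index recorded for a CPU, or -1 if that CPU has no entry. */
static int idx_for_cpu(const struct id_index_entry *entries, size_t nr, int cpu)
{
	assert(entries != NULL);

	for (size_t i = 0; i < nr; i++) {
		if (entries[i].cpu == cpu)
			return entries[i].idx;
	}
	return -1;
}

/* Return the CPU recorded for a ring buffer index, or -1 if none is found. */
static int cpu_for_idx(const struct id_index_entry *entries, size_t nr, int idx)
{
	assert(entries != NULL);

	for (size_t i = 0; i < nr; i++) {
		if (entries[i].idx == idx)
			return entries[i].cpu;
	}
	return -1;
}
```

With this table, idx_for_cpu() gives 0 for CPU 2 and 1 for CPU 3, and
using the CPU number directly as an index (2 or 3) would find nothing.
In the per-thread case the cpu field would be -1, so a lookup by CPU
is not meaningful there at all.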