[PATCH 2/4] perf cs-etm: Use previous thread for branch sample source IP

Leo Yan leo.yan at linaro.org
Mon Jun 5 17:44:31 PDT 2023


On Tue, May 30, 2023 at 03:28:09PM +0100, James Clark wrote:

[...]

> > On Wed, May 24, 2023 at 02:19:56PM +0100, James Clark wrote:
> >> Branch samples currently use the IP of the previous packet as the from
> >> IP, and the IP of the current packet as the to IP. But it incorrectly
> >> uses the current thread. In some cases like a jump into a different
> >> exception level this will attribute to the incorrect process.
> > 
> > It's about the timing that branch has taken or not taken :)
> > 
> > If we think the branch sample as 'branch has taken', then current code
> > is doning right thing, otherwise, we need this fix.
> > 
> 
> If you diff the outputs side by side you can see it mainly has an effect
> where there is a discontinuity. At this point we set either the from or
> the to IPs to 0.
> 
> For example here is a before and after perf script output. Without the
> change it looks like stress was running before it actually was. The
> schedule function that was attributed to ls on the first line hasn't
> finished running yet. But it's attributed to stress on the second line
> even though the destination IP is 0 meaning we don't even know where it
> went.

Yeah, this is a good improvement for me.  Thanks for sharing the
detailed comparison result.

> Before:
> 
>     ls  8350 [006] ... __schedule+0x394 => schedule+0x5c
> stress  8357 [006] ... schedule+0x84 => 0 [unknown]
> stress  8357 [006] ... 0 [unknown] => __unix_dgram_recvmsg+0x130
> 
> After:
> 
>     ls  8350 [006] ... __schedule+0x394 => schedule+0x5c
>     ls  8357 [006] ... schedule+0x84 => 0 [unknown]
> stress  8357 [006] ... 0 [unknown] => __unix_dgram_recvmsg+0x130
> 
> I didn't see any decode differences that weren't around these
> discontinuity points, so it seems like a low risk change.

[...]

> >> @@ -480,6 +481,7 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
> >>  	tidq->trace_chan_id = trace_chan_id;
> >>  	tidq->thread = machine__findnew_thread(&etm->session->machines.host, -1,
> >>  					       queue->tid);
> >> +	tidq->prev_thread = machine__idle_thread(&etm->session->machines.host);
> >>  
> >>  	tidq->packet = zalloc(sizeof(struct cs_etm_packet));
> >>  	if (!tidq->packet)
> >> @@ -616,6 +618,8 @@ static void cs_etm__packet_swap(struct cs_etm_auxtrace *etm,
> >>  		tmp = tidq->packet;
> >>  		tidq->packet = tidq->prev_packet;
> >>  		tidq->prev_packet = tmp;
> >> +		thread__put(tidq->prev_thread);
> >> +		tidq->prev_thread = thread__get(tidq->thread);
> > 
> > Maybe cs_etm__packet_swap() is not the best place to update
> > "tidq->prev_thread", since swapping packet doesn't mean it's necessarily
> > thread switching; can we move this change into the cs_etm__set_thread()?
> > 
> 
> Yeah that might make more sense. I can move it there if we decide to
> keep this change.

Please refine the patch for this.  Thanks and sorry my late replying.

Leo



More information about the linux-arm-kernel mailing list