[PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event

James Clark james.clark at arm.com
Fri Jun 25 06:25:15 PDT 2021



On 19/05/2021 08:19, Leo Yan wrote:
> When decode Arm SPE trace, it waits for PERF_RECORD_EXIT event (the last
> perf event) for processing trace data, which is needless and even might
> cause logic error, e.g. it might fail to correlate perf events with Arm
> SPE events correctly.
> 
> So this patch removes the condition checking for PERF_RECORD_EXIT event.
> 
> Signed-off-by: Leo Yan <leo.yan at linaro.org>
> ---
>  tools/perf/util/arm-spe.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 5c5b438584c4..58b7069c5a5f 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -717,11 +717,7 @@ static int arm_spe_process_event(struct perf_session *session,
>  					sample->time);
>  		}
>  	} else if (timestamp) {
> -		if (event->header.type == PERF_RECORD_EXIT) {
> -			err = arm_spe_process_queues(spe, timestamp);
> -			if (err)
> -				return err;
> -		}
> +		err = arm_spe_process_queues(spe, timestamp);
>  	}
>  
>  	return err;
> 

For the whole set:
Reviewed-by: James Clark <james.clark at arm.com>
Tested-by: James Clark <james.clark at arm.com>

I see a big improvement in decoding involving multiple processes because the timestamps are now
correlated with the comm and mmap events.

For example perf-exec samples are visible right before the exec is done, and on an
application that forks, samples are visible from all processes. For example:

   perf record -e arm_spe// -- bash -c "stress -c 1"
   perf script

   perf-exec  4502 [003] 259755.050409:          1    l1d-access:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
   perf-exec  4502 [003] 259755.050409:          1    tlb-access:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
   perf-exec  4502 [003] 259755.050409:          1        memory:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
   perf-exec  4502 [003] 259755.050411:          1    tlb-access:  ffff800010120fb8 __rcu_read_lock+0x0 ([kernel.kallsyms])
   bash  4502 [003] 259755.050411:          1   branch-miss:  ffff8000105b2a40 memcpy+0x80 ([kernel.kallsyms])
   bash  4502 [003] 259755.050411:          1    tlb-access:                 0 [unknown] ([unknown])
   ...
   stress  4502 [003] 259755.051468:          1    l1d-access:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
   stress  4502 [003] 259755.051468:          1    tlb-access:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
   stress  4502 [003] 259755.051468:          1        memory:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])

Previously samples were only attributed to 'stress', which was obviously wrong.

James




More information about the linux-arm-kernel mailing list