[PATCH v3 2/3] perf pmus: Simplify perf_pmus__find_core_pmu()

James Clark james.clark at arm.com
Fri Sep 15 04:17:56 PDT 2023



On 13/09/2023 16:33, James Clark wrote:
> Currently the while loop always either exits on the first iteration with
> a core PMU, or exits with NULL on heterogeneous systems or when not all
> CPUs are online.
> 
> Both of the latter behaviors are undesirable for platforms other than
> Arm so simplify it to always return the first core PMU, or NULL if none
> exist.
> 
> This behavior was depended on by the Arm version of
> pmu_metrics_table__find(), so the logic has been moved there instead.
> 
> Suggested-by: Ian Rogers <irogers at google.com>
> Reviewed-by: Ian Rogers <irogers at google.com>
> Signed-off-by: James Clark <james.clark at arm.com>

It turns out that the "Simple expression parser" test fails on
heterogeneous Arm systems without this patch. I didn't realise there was
a dependency, and I should have put the commits the other way round. I
will leave the error message here in case someone bumps into it, but no
fix is required apart from applying the remaining patches in this set:

 $ perf test expr -v
  4: Simple expression parser                                        :
 --- start ---
 test child forked, pid 4902
 Using CPUID 0x00000000410fd070
 FAILED tests/expr.c:83 get_cpuid
 test child finished with -1
 ---- end ----
 Simple expression parser: FAILED!


> ---
>  tools/perf/arch/arm64/util/pmu.c |  8 +++++++-
>  tools/perf/util/pmus.c           | 14 +-------------
>  2 files changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
> index 3d9330feebd2..3099f5f448ba 100644
> --- a/tools/perf/arch/arm64/util/pmu.c
> +++ b/tools/perf/arch/arm64/util/pmu.c
> @@ -10,8 +10,14 @@
>  
>  const struct pmu_metrics_table *pmu_metrics_table__find(void)
>  {
> -	struct perf_pmu *pmu = perf_pmus__find_core_pmu();
> +	struct perf_pmu *pmu;
> +
> +	/* Metrics aren't currently supported on heterogeneous Arm systems */
> +	if (perf_pmus__num_core_pmus() > 1)
> +		return NULL;
>  
> +	/* Doesn't matter which one here because they'll all be the same */
> +	pmu = perf_pmus__find_core_pmu();
>  	if (pmu)
>  		return perf_pmu__find_metrics_table(pmu);
>  
> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index cec869cbe163..64e798e68a2d 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -596,17 +596,5 @@ struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
>  
>  struct perf_pmu *perf_pmus__find_core_pmu(void)
>  {
> -	struct perf_pmu *pmu = NULL;
> -
> -	while ((pmu = perf_pmus__scan_core(pmu))) {
> -		/*
> -		 * The cpumap should cover all CPUs. Otherwise, some CPUs may
> -		 * not support some events or have different event IDs.
> -		 */
> -		if (RC_CHK_ACCESS(pmu->cpus)->nr != cpu__max_cpu().cpu)
> -			return NULL;
> -
> -		return pmu;
> -	}
> -	return NULL;
> +	return perf_pmus__scan_core(NULL);
>  }
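
For anyone trying to follow the behavior change without a perf tree to
hand, here is a minimal standalone sketch of it. It is not the perf code:
struct mock_pmu, mock_scan_core(), find_core_pmu_old(),
find_core_pmu_new(), num_core_pmus() and arm_metrics_pmu() are all
hypothetical stand-ins, and the cpumap check is reduced to comparing two
integers. It only models the logic described in the commit message above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_pmu; only what the sketch needs. */
struct mock_pmu {
	const char *name;
	int nr_cpus;             /* number of CPUs in this PMU's cpumap */
	struct mock_pmu *next;   /* next core PMU in the list, or NULL */
};

/* Stand-in for perf_pmus__scan_core(): passing NULL starts the scan. */
struct mock_pmu *mock_scan_core(struct mock_pmu *list, struct mock_pmu *pmu)
{
	return pmu ? pmu->next : list;
}

/*
 * Old behavior: the loop never got past its first iteration, so only the
 * first core PMU was inspected. It was returned only if its cpumap
 * covered every CPU, otherwise the result was NULL.
 */
struct mock_pmu *find_core_pmu_old(struct mock_pmu *list, int max_cpus)
{
	struct mock_pmu *pmu = mock_scan_core(list, NULL);

	assert(max_cpus > 0);
	if (pmu && pmu->nr_cpus != max_cpus)
		return NULL;
	return pmu;
}

/* New behavior: always the first core PMU, or NULL if there are none. */
struct mock_pmu *find_core_pmu_new(struct mock_pmu *list)
{
	return mock_scan_core(list, NULL);
}

/* Stand-in for perf_pmus__num_core_pmus(). */
int num_core_pmus(struct mock_pmu *list)
{
	int n = 0;

	for (struct mock_pmu *pmu = list; pmu; pmu = pmu->next)
		n++;
	return n;
}

/* Models the Arm-side guard: give up on heterogeneous systems. */
struct mock_pmu *arm_metrics_pmu(struct mock_pmu *list)
{
	if (num_core_pmus(list) > 1)
		return NULL;
	return find_core_pmu_new(list);
}
```

With two core PMUs of 4 CPUs each on an 8-CPU system,
find_core_pmu_old() returns NULL, find_core_pmu_new() returns the first
PMU, and arm_metrics_pmu() returns NULL because of the heterogeneous
guard. With a single PMU covering all 8 CPUs, all three return that PMU.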


