[PATCH v2 2/2] perf: arm_pmuv3: Don't use PMCCNTR_EL0 on SMT cores

Yicong Yang yangyicong at huawei.com
Fri Sep 19 03:27:54 PDT 2025


On 2025/9/19 17:16, Mark Rutland wrote:
> On Fri, Sep 19, 2025 at 04:56:18PM +0800, Yicong Yang wrote:
>> On 2025/9/18 21:32, Will Deacon wrote:
>>> On Wed, Aug 20, 2025 at 04:45:34PM +0800, Yicong Yang wrote:
> 
>>>> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
>>>> index 5c310e803dd7..137ef55d6973 100644
>>>> --- a/drivers/perf/arm_pmu.c
>>>> +++ b/drivers/perf/arm_pmu.c
>>>> @@ -901,6 +901,9 @@ struct arm_pmu *armpmu_alloc(void)
>>>>  
>>>>  		events = per_cpu_ptr(pmu->hw_events, cpu);
>>>>  		events->percpu_pmu = pmu;
>>>> +
>>>> +		if (!pmu->has_smt && topology_core_has_smt(cpu))
>>>> +			pmu->has_smt = true;
>>>
>>> Why isn't that just:
>>>
>>> 	pmu->has_smt = topology_core_has_smt(cpu);
>>>
>>> ?
>>
>> also works. since one pmu only contains one type of CPU, so just thought
>> no need to set it multiple times.
>>
>>> but then if that's the case, why do we need to stash the result in the
>>> PMU at all?
>>
>> should based on the discussion here [1]. stash it during probe will avoid
>> calling {raw_}smp_processor_id() in pmu::event_init() which may be
>> horrible for debug in some condition.
>>
>> [1] https://lore.kernel.org/linux-arm-kernel/aJsV7nzlILHd_ZMa@J2N7QTR9R3/
> 
> This isn't about being 'horrible for debug'; my comment there was saying
> that the proposed patch was incorrect AND it would be horrible to debug
> that in practice when it inevitably went wrong.
> 
> The key details are:
> 
> (1) We need pmu::event_init() to know whether the cycle counter can be
>     used such that it doesn't permit a group to be created which can
>     *NEVER* be scheduled in hardware. Otherwise, the core perf code will
>     waste time periodically trying to schedule that group when it will
>     *ALWAYS* be rejected by pmu::add().
> 
> (2) The pmu::event_init() call runs in a preemptible context and can
>     run on any CPU in the system, completely independent of the PMU's
>     supported CPUs. Thus [raw_]smp_processor_id() tells you nothing
>     about the CPU(s) the event will run on.
> 
>     Note that for task-bound events, the event->cpu is -1, so that
>     doesn't tell us either. Only the PMU instance tells us the set of
>     CPUs.
> 

yes this is the problem in the last approach using [raw_]smp_processor_id()
in pmu::event_init().. sorry for the wrong information replied above and
thanks for help me recall this..

> We can solve that by either stashing this boolean flag at probe time OR
> having pmu::event_init() check something like:
> 
> 	topology_core_has_smt(cpumask_first(pmu->supported_cpus));
> 

this works. I didn't think of this approach... pmu->supported_cpus may contain
offline CPUs but it doesn't matter since topology_core_has_smt() can also
retieve the SMT implementation for offline CPU.

> ... and I think stashing at probe time is nicer/clearer.
> 

I feel similar. will wait for Will's comments :)

thanks.



More information about the linux-arm-kernel mailing list