[PATCH v3 2/8] drivers/perf: hisi: Improve the detection of associated CPUs

Yicong Yang yangyicong at huawei.com
Wed Nov 6 00:33:18 PST 2024


Hi Will,

Further comment on this patch?

On 2024/10/30 22:04, Yicong Yang wrote:
> On 2024/10/29 21:28, Will Deacon wrote:
>> On Sat, Oct 26, 2024 at 03:24:18PM +0800, Yicong Yang wrote:
>>> From: Yicong Yang <yangyicong at hisilicon.com>
>>>
>>> Currently the associated CPUs are detected in the cpuhp online
>>> callback. If the CPU's sccl_id or the ccl_id matches the PMU's,
>>> they're associated. There's an exception that some PMUs locate
>>> on the SICL and will match no CPUs. The events of these PMUs
>>> can be opened on any online CPUs. To handle this we just check
>>> whether the PMU's sccl_id is -1, if so we know it locates on
>>> SICL and make any CPU associated to it.
>>>
>>> This can be tweaked so in this patch just do the below changes:
>>> - If the PMU doesn't match any CPU then associated it to online CPUs
>>> - Choose the target CPU according to the NUMA affinity for opening
>>>   events
>>>
>>> The function is implemented by hisi_pmu_init_associated_cpus() and
>>> invoked in hisi_pmu_init().
>>>
>>> Also the associated_cpus are maintained with all the online CPUs. This
>>> is redundant since we'll always schedule the events on the online CPUs.
>>> Get rid of this and make associated_cpus contain offline CPUs as well.
>>
>> I don't really understand the rationale for this change. Why is the new
>> behaviour better than the old one?
>>
> 
> Thanks for taking a look.
> 
> As mentioned in the commit, we have 2 types of PMU here:
> 1) PMUs locate on SCCL (Super CPU Cluster *), associated with certain CCL(CPU cluster *)(e.g. L3C PMU)
>    or not (e.g. DDRC PMU)
> 2) PMUs locate on the SICL (Super IO Cluster *), which has no association with certain CPU
>    topology (e.g. CPA PMU)
> 
> Currently we find associated CPUs in the cpuhp callbacks by comparing the CPU's
> MPIDR with the PMU's SCCL ID and CCL ID. This will be fine for type 1) PMUs but
> not for type 2) PMUs since no CPU will match a SICL ID. We do a trick here,
> make type 2) PMUs's SCCL to -1 and match all the CPUs. So in fact type 2) PMUs
> are already associated with online CPUs but in an implicit way. With this patch
> the behaviour will be more clear.
> 
> Another thing is about NUMA locality. Usually for type 2) PMUs they maybe on
> different NUMA node which is not considered in current approach, all SICL PMUs
> are stacked on CPU0. With the patch the NUMA node information will be considered
> if provided by the firmware.
> 
> [*] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/perf/hisi-pmu.rst?h=v6.12-rc1
> 

Another thing is that if the associated CPUs are all offline, previous approach
will show -1 by cpumask while with this patch user will always see a usable CPU
(this is ok since the uncore events are is not associated with certain CPU context
and can be open on any online CPU):

// In the previous approach
[root at localhost devices]# cat hisi_sccl3_l3c1/cpumask
8
// offline all the CPUs sharing sccl3_l3c1
[root at localhost devices]# cat hisi_sccl3_l3c1/cpumask
-1

This should make no difference for the function since perf tool will find an online
CPU for opening the event even if it gets -1 from the cpumask. But -1 maybe a bit
confusing.

Thanks.

>>>
>>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron at huawei.com>
>>> Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
>>> ---
>>>  drivers/perf/hisilicon/hisi_uncore_pmu.c | 56 +++++++++++++++++++-----
>>>  drivers/perf/hisilicon/hisi_uncore_pmu.h |  5 +++
>>>  2 files changed, 49 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
>>> index 416f72a813fc..c3549e16e0c3 100644
>>> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
>>> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
>>> @@ -399,6 +399,27 @@ void hisi_uncore_pmu_disable(struct pmu *pmu)
>>>  }
>>>  EXPORT_SYMBOL_NS_GPL(hisi_uncore_pmu_disable, HISI_PMU);
>>>  
>>> +static void hisi_pmu_init_associated_cpus(struct hisi_pmu *hisi_pmu)
>>> +{
>>> +	/*
>>> +	 * If the associated_cpus has already been initialized, for example
>>> +	 * determined by comparing the sccl_id and ccl_id with the CPU's
>>> +	 * mpidr_el1, then do nothing here. Otherwise the PMU has no affinity
>>> +	 * and could be opened on any online CPU.
>>> +	 */
>>> +	if (!cpumask_empty(&hisi_pmu->associated_cpus))
>>> +		return;
>>> +
>>> +	cpumask_copy(&hisi_pmu->associated_cpus, cpu_online_mask);
>>
>> Is it always safe to access 'cpu_online_mask' here?
>>
> 
> It should be safe. This function will be called in hisi_pmu_init() in the module
> init stage so the cpu_online_mask() should be populated, at least the current
> running CPU is contained. The hisi_pmu->associated_cpus will be used along
> with the online CPU checking in cpuhp callbacks so we're unlikely to use an
> offline CPU mistakenly. Checked cpumask_local_spread() and it also use the
> cpu_online_mask directly without synchronization against cpuhp process, so
> it also should be ok here. Please correct my if any cases I missed.
> 
> Thanks,
> Yicong
> 
> 
> 
> .
> 



More information about the linux-arm-kernel mailing list