[PATCH v3 2/8] drivers/perf: hisi: Improve the detection of associated CPUs
Yicong Yang
yangyicong at huawei.com
Wed Oct 30 07:04:43 PDT 2024
On 2024/10/29 21:28, Will Deacon wrote:
> On Sat, Oct 26, 2024 at 03:24:18PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong at hisilicon.com>
>>
>> Currently the associated CPUs are detected in the cpuhp online
>> callback. If the CPU's sccl_id or the ccl_id matches the PMU's,
>> they're associated. There's an exception that some PMUs locate
>> on the SICL and will match no CPUs. The events of these PMUs
>> can be opened on any online CPUs. To handle this we just check
>> whether the PMU's sccl_id is -1, if so we know it locates on
>> SICL and make any CPU associated to it.
>>
>> This can be tweaked so in this patch just do the below changes:
>> - If the PMU doesn't match any CPU then associated it to online CPUs
>> - Choose the target CPU according to the NUMA affinity for opening
>> events
>>
>> The function is implemented by hisi_pmu_init_associated_cpus() and
>> invoked in hisi_pmu_init().
>>
>> Also the associated_cpus are maintained with all the online CPUs. This
>> is redundant since we'll always schedule the events on the online CPUs.
>> Get rid of this and make associated_cpus contain offline CPUs as well.
>
> I don't really understand the rationale for this change. Why is the new
> behaviour better than the old one?
>
Thanks for taking a look.
As mentioned in the commit, we have 2 types of PMU here:
1) PMUs locate on SCCL (Super CPU Cluster *), associated with certain CCL(CPU cluster *)(e.g. L3C PMU)
or not (e.g. DDRC PMU)
2) PMUs locate on the SICL (Super IO Cluster *), which has no association with certain CPU
topology (e.g. CPA PMU)
Currently we find associated CPUs in the cpuhp callbacks by comparing the CPU's
MPIDR with the PMU's SCCL ID and CCL ID. This will be fine for type 1) PMUs but
not for type 2) PMUs since no CPU will match a SICL ID. We do a trick here,
make type 2) PMUs's SCCL to -1 and match all the CPUs. So in fact type 2) PMUs
are already associated with online CPUs but in an implicit way. With this patch
the behaviour will be more clear.
Another thing is about NUMA locality. Usually for type 2) PMUs they maybe on
different NUMA node which is not considered in current approach, all SICL PMUs
are stacked on CPU0. With the patch the NUMA node information will be considered
if provided by the firmware.
[*] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/perf/hisi-pmu.rst?h=v6.12-rc1
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron at huawei.com>
>> Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
>> ---
>> drivers/perf/hisilicon/hisi_uncore_pmu.c | 56 +++++++++++++++++++-----
>> drivers/perf/hisilicon/hisi_uncore_pmu.h | 5 +++
>> 2 files changed, 49 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
>> index 416f72a813fc..c3549e16e0c3 100644
>> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
>> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
>> @@ -399,6 +399,27 @@ void hisi_uncore_pmu_disable(struct pmu *pmu)
>> }
>> EXPORT_SYMBOL_NS_GPL(hisi_uncore_pmu_disable, HISI_PMU);
>>
>> +static void hisi_pmu_init_associated_cpus(struct hisi_pmu *hisi_pmu)
>> +{
>> + /*
>> + * If the associated_cpus has already been initialized, for example
>> + * determined by comparing the sccl_id and ccl_id with the CPU's
>> + * mpidr_el1, then do nothing here. Otherwise the PMU has no affinity
>> + * and could be opened on any online CPU.
>> + */
>> + if (!cpumask_empty(&hisi_pmu->associated_cpus))
>> + return;
>> +
>> + cpumask_copy(&hisi_pmu->associated_cpus, cpu_online_mask);
>
> Is it always safe to access 'cpu_online_mask' here?
>
It should be safe. This function will be called in hisi_pmu_init() in the module
init stage so the cpu_online_mask() should be populated, at least the current
running CPU is contained. The hisi_pmu->associated_cpus will be used along
with the online CPU checking in cpuhp callbacks so we're unlikely to use an
offline CPU mistakenly. Checked cpumask_local_spread() and it also use the
cpu_online_mask directly without synchronization against cpuhp process, so
it also should be ok here. Please correct my if any cases I missed.
Thanks,
Yicong
More information about the linux-arm-kernel
mailing list