[PATCH v3 2/8] drivers/perf: hisi: Improve the detection of associated CPUs
Yicong Yang
yangyicong at huawei.com
Wed Nov 6 00:33:18 PST 2024
Hi Will,
Further comment on this patch?
On 2024/10/30 22:04, Yicong Yang wrote:
> On 2024/10/29 21:28, Will Deacon wrote:
>> On Sat, Oct 26, 2024 at 03:24:18PM +0800, Yicong Yang wrote:
>>> From: Yicong Yang <yangyicong at hisilicon.com>
>>>
>>> Currently the associated CPUs are detected in the cpuhp online
>>> callback. If the CPU's sccl_id or the ccl_id matches the PMU's,
>>> they're associated. There's an exception that some PMUs locate
>>> on the SICL and will match no CPUs. The events of these PMUs
>>> can be opened on any online CPUs. To handle this we just check
>>> whether the PMU's sccl_id is -1, if so we know it locates on
>>> SICL and make any CPU associated to it.
>>>
>>> This can be tweaked so in this patch just do the below changes:
>>> - If the PMU doesn't match any CPU then associated it to online CPUs
>>> - Choose the target CPU according to the NUMA affinity for opening
>>> events
>>>
>>> The function is implemented by hisi_pmu_init_associated_cpus() and
>>> invoked in hisi_pmu_init().
>>>
>>> Also the associated_cpus are maintained with all the online CPUs. This
>>> is redundant since we'll always schedule the events on the online CPUs.
>>> Get rid of this and make associated_cpus contain offline CPUs as well.
>>
>> I don't really understand the rationale for this change. Why is the new
>> behaviour better than the old one?
>>
>
> Thanks for taking a look.
>
> As mentioned in the commit, we have 2 types of PMU here:
> 1) PMUs locate on SCCL (Super CPU Cluster *), associated with certain CCL(CPU cluster *)(e.g. L3C PMU)
> or not (e.g. DDRC PMU)
> 2) PMUs locate on the SICL (Super IO Cluster *), which has no association with certain CPU
> topology (e.g. CPA PMU)
>
> Currently we find associated CPUs in the cpuhp callbacks by comparing the CPU's
> MPIDR with the PMU's SCCL ID and CCL ID. This will be fine for type 1) PMUs but
> not for type 2) PMUs since no CPU will match a SICL ID. We do a trick here,
> make type 2) PMUs's SCCL to -1 and match all the CPUs. So in fact type 2) PMUs
> are already associated with online CPUs but in an implicit way. With this patch
> the behaviour will be more clear.
>
> Another thing is about NUMA locality. Usually for type 2) PMUs they maybe on
> different NUMA node which is not considered in current approach, all SICL PMUs
> are stacked on CPU0. With the patch the NUMA node information will be considered
> if provided by the firmware.
>
> [*] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/perf/hisi-pmu.rst?h=v6.12-rc1
>
Another thing is that if the associated CPUs are all offline, previous approach
will show -1 by cpumask while with this patch user will always see a usable CPU
(this is ok since the uncore events are is not associated with certain CPU context
and can be open on any online CPU):
// In the previous approach
[root at localhost devices]# cat hisi_sccl3_l3c1/cpumask
8
// offline all the CPUs sharing sccl3_l3c1
[root at localhost devices]# cat hisi_sccl3_l3c1/cpumask
-1
This should make no difference for the function since perf tool will find an online
CPU for opening the event even if it gets -1 from the cpumask. But -1 maybe a bit
confusing.
Thanks.
>>>
>>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron at huawei.com>
>>> Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
>>> ---
>>> drivers/perf/hisilicon/hisi_uncore_pmu.c | 56 +++++++++++++++++++-----
>>> drivers/perf/hisilicon/hisi_uncore_pmu.h | 5 +++
>>> 2 files changed, 49 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
>>> index 416f72a813fc..c3549e16e0c3 100644
>>> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
>>> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
>>> @@ -399,6 +399,27 @@ void hisi_uncore_pmu_disable(struct pmu *pmu)
>>> }
>>> EXPORT_SYMBOL_NS_GPL(hisi_uncore_pmu_disable, HISI_PMU);
>>>
>>> +static void hisi_pmu_init_associated_cpus(struct hisi_pmu *hisi_pmu)
>>> +{
>>> + /*
>>> + * If the associated_cpus has already been initialized, for example
>>> + * determined by comparing the sccl_id and ccl_id with the CPU's
>>> + * mpidr_el1, then do nothing here. Otherwise the PMU has no affinity
>>> + * and could be opened on any online CPU.
>>> + */
>>> + if (!cpumask_empty(&hisi_pmu->associated_cpus))
>>> + return;
>>> +
>>> + cpumask_copy(&hisi_pmu->associated_cpus, cpu_online_mask);
>>
>> Is it always safe to access 'cpu_online_mask' here?
>>
>
> It should be safe. This function will be called in hisi_pmu_init() in the module
> init stage so the cpu_online_mask() should be populated, at least the current
> running CPU is contained. The hisi_pmu->associated_cpus will be used along
> with the online CPU checking in cpuhp callbacks so we're unlikely to use an
> offline CPU mistakenly. Checked cpumask_local_spread() and it also use the
> cpu_online_mask directly without synchronization against cpuhp process, so
> it also should be ok here. Please correct my if any cases I missed.
>
> Thanks,
> Yicong
>
>
>
> .
>
More information about the linux-arm-kernel
mailing list