[REPOST PATCH v6 3/3] arm64: topology: Handle AMU FIE setup on CPU hotplug

Geert Uytterhoeven geert at linux-m68k.org
Thu Jan 15 00:37:31 PST 2026


Hi Lifeng,

On Thu, 15 Jan 2026 at 03:25, zhenglifeng (A) <zhenglifeng1 at huawei.com> wrote:
> On 2026/1/14 21:54, Geert Uytterhoeven wrote:
> > On Tue, 13 Jan 2026 at 16:58, Beata Michalska <beata.michalska at arm.com> wrote:
> >> On Tue, Jan 13, 2026 at 11:51:45AM +0100, Geert Uytterhoeven wrote:
> >>> On Tue, 30 Dec 2025 at 09:02, Lifeng Zheng <zhenglifeng1 at huawei.com> wrote:
> >>>> Currently, when a cpufreq policy is created, the AMU FIE setup process
> >>>> checks all CPUs in the policy -- including those that are offline. If any
> >>>> of these CPUs are offline at that time, their AMU capability flag hasn't
> >>>> been verified yet, causing the check to fail. As a result, AMU FIE is not
> >>>> enabled, even if the CPUs that are online do support it.
> >>>>
> >>>> Later, when the previously offline CPUs come online and report AMU support,
> >>>> there's no mechanism in place to re-enable AMU FIE for the policy. This
> >>>> leaves the entire frequency domain without AMU FIE, despite being eligible.
> >>>>
> >>>> Restrict the initial AMU FIE check to only those CPUs that are online at
> >>>> the time the policy is created, and allow CPUs that come online later to
> >>>> join the policy with AMU FIE enabled.
> >>>>
> >>>> Signed-off-by: Lifeng Zheng <zhenglifeng1 at huawei.com>
> >>>> Acked-by: Beata Michalska <beata.michalska at arm.com>
> >>>
> >>> Thanks for your patch, which is now commit 6fd9be0b7b2e957d
> >>> ("arm64: topology: Handle AMU FIE setup on CPU hotplug") in
> >>> arm64/for-next/core (next-20260107 and later).
> >>>
> >>>> --- a/arch/arm64/kernel/topology.c
> >>>> +++ b/arch/arm64/kernel/topology.c
> >>>> @@ -284,7 +284,7 @@ static int init_amu_fie_callback(struct notifier_block *nb, unsigned long val,
> >>>>         struct cpufreq_policy *policy = data;
> >>>>
> >>>>         if (val == CPUFREQ_CREATE_POLICY)
> >>>> -               amu_fie_setup(policy->related_cpus);
> >>>> +               amu_fie_setup(policy->cpus);
> >>>>
> >>>>         /*
> >>>>          * We don't need to handle CPUFREQ_REMOVE_POLICY event as the AMU
> >>>> @@ -303,10 +303,71 @@ static struct notifier_block init_amu_fie_notifier = {
> >>>>         .notifier_call = init_amu_fie_callback,
> >>>>  };
> >>>>
> >>>> +static int cpuhp_topology_online(unsigned int cpu)
> >>>> +{
> >>>> +       struct cpufreq_policy *policy = cpufreq_cpu_policy(cpu);
> >>>> +
> >>>> +       /* Those are cheap checks */
> >>>> +
> >>>> +       /*
> >>>> +        * Skip this CPU if:
> >>>> +        *  - it has no cpufreq policy assigned yet,
> >>>> +        *  - no policy exists that spans CPUs with AMU counters, or
> >>>> +        *  - it was already handled.
> >>>> +        */
> >>>> +       if (unlikely(!policy) || !cpumask_available(amu_fie_cpus) ||
> >>>> +           cpumask_test_cpu(cpu, amu_fie_cpus))
> >>>> +               return 0;
> >>>> +
> >>>> +       /*
> >>>> +        * Only proceed if all already-online CPUs in this policy
> >>>> +        * support AMU counters.
> >>>> +        */
> >>>> +       if (unlikely(!cpumask_subset(policy->cpus, amu_fie_cpus)))
> >>>> +               return 0;
> >>>> +
> >>>> +       /*
> >>>> +        * If the new online CPU cannot pass this check, all the CPUs related to
> >>>> +        * the same policy should be clear from amu_fie_cpus mask, otherwise they
> >>>> +        * may use different source of the freq scale.
> >>>> +        */
> >>>> +       if (!freq_counters_valid(cpu)) {
> >>>> +               pr_warn("CPU[%u] doesn't support AMU counters\n", cpu);
> >>>
> >>> This is triggered during resume from s2ram on Renesas R-Car H3
> >>> (big.LITTLE 4x Cortex-A57 + 4x Cortex-A53), when enabling the first
> >>> little core:
> >>>
> >>>     AMU: CPU[4] doesn't support AMU counters
> >>>
> >>> Adding debug code:
> >>>
> >>>     pr_info("Calling topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_ARCH, %*pbl)\n",
> >>>             cpumask_pr_args(policy->related_cpus));
> >>>     pr_info("Calling cpumask_andnot(..., %*pbl, %*pbl)\n",
> >>>             cpumask_pr_args(amu_fie_cpus), cpumask_pr_args(policy->related_cpus));
> >>>
> >>> gives:
> >>>
> >>>     AMU: Calling topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_ARCH, 4-7)
> >>>     AMU: Calling cpumask_andnot(..., , 4-7)
> >>>
> >>> so AMU is disabled for all little cores.
> >>>
> >>> Since this only happens during s2ram, and not during initial CPU
> >>> bring-up on boot, this looks wrong to me?
> >> This does look rather surprising. If that CPU was marked as supporting AMUs at
> >> the initial bring-up it should be part of amu_fie_cpus mask, so the hp callback
> >> should bail out straight away. Would you be able to add some logs to see what
> >> that mask actually contains?
> >> Furthermore, freq_counters_valid is logging issues when validating the counters.
> >> Would you be able to re-run it with the debug level to see what might be
> >> happening under the hood, although I am still unsure why it is even reaching
> >> that point ...
> >
> > Adding extra debugging info, and "#define DEBUG" at the top.
> >
> > During boot:
> >
> >     AMU: amu_fie_setup:260: cpus 0-3 amu_fie_cpus
> >     ^^^ empty amu_fie_cpus
> >     AMU: CPU0: counters are not supported.
> >     ^^^ pr_debug
> >     AMU: amu_fie_setup:260: cpus 4-7 amu_fie_cpus
> >     ^^^ empty amu_fie_cpus
> >     AMU: CPU4: counters are not supported.
> >     ^^^ pr_debug
> >
> > During resume from s2ram:
> >
> >     AMU: cpuhp_topology_online:314: cpu 1 amu_fie_cpus
> >     AMU: cpuhp_topology_online:343: skipped (!cpumask_subset(policy->cpus, amu_fie_cpus))
> >     AMU: cpuhp_topology_online:314: cpu 2 amu_fie_cpus
> >     AMU: cpuhp_topology_online:343: skipped (!cpumask_subset(policy->cpus, amu_fie_cpus))
> >     AMU: cpuhp_topology_online:314: cpu 3 amu_fie_cpus
> >     AMU: cpuhp_topology_online:343: skipped (!cpumask_subset(policy->cpus, amu_fie_cpus))
> >     AMU: cpuhp_topology_online:314: cpu 4 amu_fie_cpus
> >     AMU: CPU4: counters are not supported.
> >     ^^^ pr_debug
> >     AMU: CPU[4] doesn't support AMU counters
> >     ^^^ pr_warn
> >     AMU: Calling topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_ARCH, 4-7)
> >     AMU: Calling cpumask_andnot(..., , 4-7)
>
> Something is strange here. If AMU is not supported at all, amu_fie_cpus should
> never be available and cpuhp_topology_online() should return in the first
> 'if'. Why does it run this far?

You mean the "!cpumask_available(amu_fie_cpus)" test?

include/linux/cpumask.h:

    #ifdef CONFIG_CPUMASK_OFFSTACK
    static __always_inline bool cpumask_available(cpumask_var_t mask)
    {
            return mask != NULL;
    }
    #else
    static __always_inline bool cpumask_available(cpumask_var_t mask)
    {
            return true;
    }
    #endif /* CONFIG_CPUMASK_OFFSTACK */

include/linux/cpumask_types.h:

    #ifdef CONFIG_CPUMASK_OFFSTACK
    typedef struct cpumask *cpumask_var_t;
    #else
    typedef struct cpumask cpumask_var_t[1];
    #endif /* CONFIG_CPUMASK_OFFSTACK */

So if CONFIG_CPUMASK_OFFSTACK is not enabled, it always returns true.

arch/arm64/Kconfig:

    config ARM64
        [...]
        select CPUMASK_OFFSTACK if NR_CPUS > 256
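
To illustrate the difference, here is a minimal userspace sketch of the two
variants side by side. The names struct cpumask (reduced to a single word),
offstack_var_t, onstack_var_t, offstack_available(), onstack_available(),
mask_empty() and check_variants() are made up for this illustration and are
not kernel symbols; only the shape of the two typedefs and of the two checks
follows the headers quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct cpumask (one word of bits). */
struct cpumask {
	unsigned long bits[1];
};

/* Shape used with CONFIG_CPUMASK_OFFSTACK=y: a pointer that may be NULL. */
typedef struct cpumask *offstack_var_t;

static bool offstack_available(offstack_var_t mask)
{
	return mask != NULL;
}

/* Shape used with CONFIG_CPUMASK_OFFSTACK=n: a one-element array. */
typedef struct cpumask onstack_var_t[1];

static bool onstack_available(onstack_var_t mask)
{
	(void)mask;	/* the storage always exists, so there is nothing to test */
	return true;
}

static bool mask_empty(const struct cpumask *mask)
{
	return mask->bits[0] == 0;
}

/* File-scope objects are zero-initialised, like a mask nobody ever set up. */
static offstack_var_t offstack_mask;
static onstack_var_t onstack_mask;

static void check_variants(void)
{
	/* Off-stack: "never allocated" is visible as a NULL pointer. */
	assert(!offstack_available(offstack_mask));

	/* On-stack: reported as available although no bit was ever set. */
	assert(onstack_available(onstack_mask));
	assert(mask_empty(onstack_mask));
}
```

In the second variant an "availability" test cannot tell a mask that was
never populated from one that is in use; only an emptiness test can.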

> >     AMU: cpuhp_topology_online:314: cpu 5 amu_fie_cpus
> >     AMU: cpuhp_topology_online:343: skipped (!cpumask_subset(policy->cpus, amu_fie_cpus))
> >     AMU: cpuhp_topology_online:314: cpu 6 amu_fie_cpus
> >     AMU: cpuhp_topology_online:343: skipped (!cpumask_subset(policy->cpus, amu_fie_cpus))
> >     AMU: cpuhp_topology_online:314: cpu 7 amu_fie_cpus
> >     AMU: cpuhp_topology_online:343: skipped (!cpumask_subset(policy->cpus, amu_fie_cpus))
> >
> > Hence there is no issue, as AMU is not supported at all!
> >
> > The confusing part is in the (absence of) logging.
> > If AMU is not supported, freq_counters_valid() uses:
> >
> >      pr_debug("CPU%d: counters are not supported.\n", cpu);
> >
> > which is typically not printed, unless DEBUG is enabled.
> >
> > If freq_counters_valid() failed, the new cpuhp_topology_online() uses:
> >
> >     pr_warn("CPU[%u] doesn't support AMU counters\n", cpu);
> >
> > which is always printed.
> >
> > Given freq_counters_valid() already prints a (debug) message, I think
> > the pr_warn() should just be removed.  Do you agree, or is there still
> > another incorrect check that should prevent getting this far?
>
> I'm OK with removing it.

OK, I will send a patch.
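
For the record, the order of events in the resume trace quoted above can be
reproduced with a small userspace model of the checks in
cpuhp_topology_online(). This is only a sketch under assumptions that are not
stated in the thread: CPU0 stays online across suspend, each CPU is added to
its policy's cpus mask only after the callback has run (which is what the
trace suggests, as CPU4 passes the subset test while CPU5-7 do not), and CPUs
0-3 and 4-7 form one policy each. All names below (online_model(),
resume_model(), check_resume_trace(), enum outcome, the plain unsigned int
masks) are invented for the model; the success path is left out because the
excerpt does not show it.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS	8

enum outcome { NOT_RUN, ALREADY_HANDLED, SKIPPED_SUBSET, COUNTERS_INVALID };

/* Bit n set means CPU n is in the mask. */
static unsigned int amu_fie_cpus;				/* stays empty: no AMU */
static unsigned int policy_cpus[2];				/* online CPUs per policy */
static const unsigned int related_cpus[2] = { 0x0fu, 0xf0u };	/* CPUs 0-3, 4-7 */

static bool mask_subset(unsigned int a, unsigned int b)
{
	return (a & ~b) == 0;	/* the empty mask is a subset of any mask */
}

static enum outcome online_model(unsigned int cpu)
{
	unsigned int idx = cpu / 4;
	enum outcome ret;

	if (amu_fie_cpus & (1u << cpu)) {
		ret = ALREADY_HANDLED;
	} else if (!mask_subset(policy_cpus[idx], amu_fie_cpus)) {
		ret = SKIPPED_SUBSET;
	} else {
		/* The counter validation would fail here: the SoC has no AMU. */
		amu_fie_cpus &= ~related_cpus[idx];
		ret = COUNTERS_INVALID;
	}

	/* Assumption: the CPU joins the policy's cpus mask after the callback. */
	policy_cpus[idx] |= 1u << cpu;
	return ret;
}

static void resume_model(enum outcome out[NR_CPUS])
{
	amu_fie_cpus = 0;
	policy_cpus[0] = 1u << 0;	/* assumption: only CPU0 is still online */
	policy_cpus[1] = 0;
	out[0] = NOT_RUN;

	for (unsigned int cpu = 1; cpu < NR_CPUS; cpu++)
		out[cpu] = online_model(cpu);
}

static void check_resume_trace(void)
{
	enum outcome out[NR_CPUS];

	resume_model(out);
	for (unsigned int cpu = 1; cpu < NR_CPUS; cpu++)
		assert(out[cpu] == (cpu == 4 ? COUNTERS_INVALID : SKIPPED_SUBSET));
}
```

With an empty amu_fie_cpus, the subset test only passes for the first CPU of
a policy whose cpus mask is still empty, so in this model only CPU4 reaches
the counter validation, matching the single warning seen in the trace.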

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


