[PATCH 2/2] arm64/watchdog_hld: Add a cpufreq notifier for update watchdog thresh

Tue Mar 11 01:32:53 PDT 2025

On 2025/3/8 0:53, Doug Anderson wrote:
> Hi,
> 
> On Thu, Mar 6, 2025 at 6:18 PM Yicong Yang <yangyicong at huawei.com> wrote:
>>
>> From: Yicong Yang <yangyicong at hisilicon.com>
>>
>> arm64 depends on the cpufreq driver to gain the maximum cpu frequency
>> to convert the watchdog_thresh to perf event period. cpufreq drivers
>> like cppc_cpufreq will be initialized lately after the initializing of
>> the hard lockup detector so just use a safe cpufreq which will be
>> inaccurency. Use a cpufreq notifier to adjust the event's period to
>> a more accurate one.
>>
>> Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
>> ---
>>  arch/arm64/kernel/watchdog_hld.c | 34 ++++++++++++++++++++++++++++++++
>>  1 file changed, 34 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/watchdog_hld.c b/arch/arm64/kernel/watchdog_hld.c
>> index dcd25322127c..719ae39e8bc0 100644
>> --- a/arch/arm64/kernel/watchdog_hld.c
>> +++ b/arch/arm64/kernel/watchdog_hld.c
>> @@ -34,3 +34,37 @@ bool __init arch_perf_nmi_is_available(void)
>>          */
>>         return arm_pmu_irq_is_nmi();
>>  }
>> +
>> +static int watchdog_freq_notifier_callback(struct notifier_block *nb,
>> +                                          unsigned long val, void *data)
>> +{
>> +       struct cpufreq_policy *policy = data;
>> +       unsigned long max_cpu_freq;
> 
> Feels like max_cpu_freq should be a u64. It probably should have been
> u64 in hw_nmi_get_sample_period() as well. Otherwise you could end up
> with a problem if the max CPU frequency is above ~4.3GHz and "unsigned
> long" is 32-bits (I think it technically could be?)
> 

I guess this won't happen for arm64 and we're in kernel space. here I
keep the same type with hw_nmi_get_sample_period() which is also using
"unsigned long".

> 
>> +       u64 new_period;
>> +       int cpu;
>> +
>> +       if (val != CPUFREQ_CREATE_POLICY)
>> +               return 0;
> 
> nit: maybe better documenting to return "NOTIFY_DONE" instead of 0?
> 

will use NOTIFY_NONE.

> 
>> +       for_each_cpu(cpu, policy->cpus) {
>> +               max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
>> +               if (!max_cpu_freq)
>> +                       continue;
>> +
>> +               new_period = watchdog_thresh * max_cpu_freq;
>> +               hardlockup_detector_perf_adjust_period(cpu, new_period);
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +static struct notifier_block watchdog_freq_notifier = {
>> +       .notifier_call = watchdog_freq_notifier_callback,
>> +};
>> +
>> +static int __init init_watchdog_freq_notifier(void)
>> +{
>> +       return cpufreq_register_notifier(&watchdog_freq_notifier,
>> +                                        CPUFREQ_POLICY_NOTIFIER);
>> +}
>> +core_initcall(init_watchdog_freq_notifier);
> 
> I haven't dug into everything, but don't you have a potential blind
> spot, though? What if the order is this:
> 
> 1. hw_nmi_get_sample_period() is called
> 2. cpufreq driver probes
> 3. init_watchdog_freq_notifier() initcall runs
> 
> Since the cpufreq driver proved before you registered for the
> notifier, won't you miss the creation of the policy? Maybe instead of
> registering for an initcall you should somehow trigger off the first
> call to hw_nmi_get_sample_period() and register at that point in time?
> 

for arm64 the watchdog_perf is registered after armv8 PMU driver's init,
so the sequence is like:

# core_initcall
init_watchdog_freq_notifier()
# device_initcall
armv8_pmu_driver_init()
  lockup_detector_retry_init()
    schedule_work(&detector_work);
# in detector work
hw_nmi_get_sample_period() // notifier's already been registered

so we'll always get the correct frequency, either by the cpufreq
already registered or by the notifier which will update the
frequency. Please correct me if I missed anything :)

Thanks.