[PATCH 2/2] arm64/watchdog_hld: Add a cpufreq notifier for update watchdog thresh

Yicong Yang yangyicong at huawei.com
Fri Mar 14 02:33:54 PDT 2025


On 2025/3/14 10:03, Doug Anderson wrote:
> Hi,
> 
> On Wed, Mar 12, 2025 at 12:48 AM Yicong Yang <yangyicong at huawei.com> wrote:
>>
>>>>>> +static int __init init_watchdog_freq_notifier(void)
>>>>>> +{
>>>>>> +       return cpufreq_register_notifier(&watchdog_freq_notifier,
>>>>>> +                                        CPUFREQ_POLICY_NOTIFIER);
>>>>>> +}
>>>>>> +core_initcall(init_watchdog_freq_notifier);
>>>>>
>>>>> I haven't dug into everything, but don't you have a potential blind
>>>>> spot, though? What if the order is this:
>>>>>
>>>>> 1. hw_nmi_get_sample_period() is called
>>>>> 2. cpufreq driver probes
>>>>> 3. init_watchdog_freq_notifier() initcall runs
>>>>>
>>>>> Since the cpufreq driver probed before you registered for the
>>>>> notifier, won't you miss the creation of the policy? Maybe instead of
>>>>> registering for an initcall you should somehow trigger off the first
>>>>> call to hw_nmi_get_sample_period() and register at that point in time?
>>>>>
>>>>
>>>> For arm64 the watchdog_perf is registered after the armv8 PMU driver's
>>>> init, so the sequence is like:
>>>>
>>>> # core_initcall
>>>> init_watchdog_freq_notifier()
>>>> # device_initcall
>>>> armv8_pmu_driver_init()
>>>>   lockup_detector_retry_init()
>>>>     schedule_work(&detector_work);
>>>> # in detector work
>>>> hw_nmi_get_sample_period() // notifier's already been registered
>>>>
>>>> so we'll always get the correct frequency, either from the cpufreq
>>>> driver that's already registered or from the notifier, which will
>>>> update the frequency later. Please correct me if I missed anything :)
>>>
>>> Hmm, OK. So this relies on the fact that init_watchdog_freq_notifier()
>>> is at "core" level, so it'll pretty much init before any of the other
>>> things. This type of thing is always a little fragile, but I guess
>>> it's OK.
>>>
>>> What about if the order is different though? Again, I haven't traced
>>> through it all, but I'm thinking:
>>>
>>> # core_initcall
>>> init_watchdog_freq_notifier()
>>> # device_initcall
>>> <cpufreq driver policy is created>
>>>
>>> Now hardlockup_detector_perf_adjust_period() will be called before
>>> armv8_pmu_driver_init(). Is that a problem?
>>
>> It doesn't matter, since in this situation:
>>
>> # device_initcall
>> <cpufreq driver policy is created>
>>   hardlockup_detector_perf_adjust_period()
>>     if (!event) // armv8_pmu_driver_init() not called, no event created
>>       return;
>> # later after armv8_pmu_driver_init()
>> hw_nmi_get_sample_period() // since the cpufreq driver has already been
>>                            // registered, we can get the accurate cpufreq
> 
> Got it. Sorry for asking the dumb questions--I didn't have time to dig
> through the code earlier this week. Yeah, looks like it should be OK.
> 
> 
>>> I guess even worse would be if async probe was on and the cpufreq
>>> driver and PMU driver were running at the same time?
>>>
>>
>> I think there are 2 cases here:
>> 1) cpufreq initialized before hw_nmi_get_sample_period() is called, in
>>    which case the cpufreq is accurate when creating the event
>> 2) hw_nmi_get_sample_period() called before the cpufreq driver is
>>    initialized, in which case we rely on the notifier call to adjust the
>>    event's period
>>
>> We may need a further check to see if there's any race window here, but
>> the worst case will be using the inaccurate cpufreq (5GHz) for calculating
>> the period.
> 
> OK, sure. _Maybe_ there could be some small race if these two things
> are happening at exactly the same time on two different cores:
> 
> core 1:
> * lockup_detector_delay_init()
> * watchdog_hardlockup_probe()
> * hardlockup_detector_event_create()
> * hw_nmi_get_sample_period()
> * No cpufreq driver, so returns SAFE_MAX_CPU_FREQ
> 
> core 2:
> * cpufreq driver probes + policy is created
> * "event" hasn't been created yet, so your new code is a no-op
> 
> core 1:
> * event gets created based on SAFE_MAX_CPU_FREQ.
> 
> Did I get that right?
> 

I think yes, the race should exist theoretically.
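
To make the window concrete, here's a simplified view of the two paths. The
adjust-period side below is just my paraphrase of what this series does, so
treat the details as illustrative:

/* arm64 side, runs from the detector work on core 1 */
u64 hw_nmi_get_sample_period(int watchdog_thresh)
{
	unsigned int cpu = smp_processor_id();
	unsigned long max_cpu_freq;

	max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
	if (!max_cpu_freq)                        /* no policy visible yet, */
		max_cpu_freq = SAFE_MAX_CPU_FREQ; /* fall back to 5GHz     */

	return (u64)max_cpu_freq * watchdog_thresh;
}

/* watchdog_perf side, called via the cpufreq policy notifier on core 2 */
void hardlockup_detector_perf_adjust_period(u64 period)
{
	struct perf_event *event = this_cpu_read(watchdog_ev);

	if (!event)     /* event not created yet, so this is a no-op and   */
		return; /* the 5GHz-based period above is never corrected  */

	if (event->attr.sample_period != period)
		perf_event_period(event, period);
}

So once that window is hit, the period stays based on SAFE_MAX_CPU_FREQ, which
matches the behavior before this series, as you said.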

> That doesn't feel _completely_ impossible, especially since
> lockup_detector_delay_init() is already running on a separate
> workqueue. As you said, though, the worst case looks like we'll
> just end up with how things were before your patch series.
> 
> In any case, I guess it's pretty unlikely, but maybe it wouldn't be
> very hard to add a mutex in there? It doesn't feel like that would add
> much overhead...
> 

I'll see how to do the synchronization here. I need to look a bit deeper into
the cpufreq core. Will update in the next version.
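
For now, one rough idea (sketch only, not the final patch, and
watchdog_perf_mutex is just a made-up name) would be to take a mutex in both
hardlockup_detector_perf_adjust_period() and the event creation path, and
compute the sample period inside the locked region so a policy created just
before is guaranteed to be seen:

/* Sketch only, untested. */
static DEFINE_MUTEX(watchdog_perf_mutex);

void hardlockup_detector_perf_adjust_period(u64 period)
{
	struct perf_event *event;

	mutex_lock(&watchdog_perf_mutex);
	event = this_cpu_read(watchdog_ev);
	/*
	 * If the event doesn't exist yet, the creation path runs later,
	 * takes the same mutex and calls hw_nmi_get_sample_period() with
	 * the already-created policy visible, so nothing is lost here.
	 */
	if (event && event->attr.sample_period != period)
		perf_event_period(event, period);
	mutex_unlock(&watchdog_perf_mutex);
}

hardlockup_detector_event_create() would then take the same mutex around
computing the period and publishing the per-CPU event. Whether that's safe in
the notifier context is part of what I need to check in the cpufreq core.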

Thanks.




