[PATCH 2/2] arm64/watchdog_hld: Add a cpufreq notifier for update watchdog thresh
Doug Anderson
dianders at chromium.org
Thu Mar 13 19:03:10 PDT 2025
Hi,
On Wed, Mar 12, 2025 at 12:48 AM Yicong Yang <yangyicong at huawei.com> wrote:
>
> >>>> +static int __init init_watchdog_freq_notifier(void)
> >>>> +{
> >>>> + return cpufreq_register_notifier(&watchdog_freq_notifier,
> >>>> + CPUFREQ_POLICY_NOTIFIER);
> >>>> +}
> >>>> +core_initcall(init_watchdog_freq_notifier);
> >>>
> >>> I haven't dug into everything, but don't you have a potential blind
> >>> spot, though? What if the order is this:
> >>>
> >>> 1. hw_nmi_get_sample_period() is called
> >>> 2. cpufreq driver probes
> >>> 3. init_watchdog_freq_notifier() initcall runs
> >>>
> >>> Since the cpufreq driver probed before you registered for the
> >>> notifier, won't you miss the creation of the policy? Maybe instead of
> >>> registering for an initcall you should somehow trigger off the first
> >>> call to hw_nmi_get_sample_period() and register at that point in time?
> >>>
> >>
> >> for arm64 the watchdog_perf is registered after armv8 PMU driver's init,
> >> so the sequence is like:
> >>
> >> # core_initcall
> >> init_watchdog_freq_notifier()
> >> # device_initcall
> >> armv8_pmu_driver_init()
> >> lockup_detector_retry_init()
> >> schedule_work(&detector_work);
> >> # in detector work
> >> hw_nmi_get_sample_period() // notifier's already been registered
> >>
> >> so we'll always get the correct frequency, either because cpufreq
> >> is already registered or via the notifier, which will update the
> >> frequency. Please correct me if I missed anything :)
> >
> > Hmm, OK. So this relies on the fact that because the
> > init_watchdog_freq_notifier() is at "core" level that it'll pretty
> > much init before any of the other things. This type of thing is always
> > a little fragile but I guess it's OK.
> >
> > What about if the order is different though? Again, I haven't traced
> > through it all, but I'm thinking:
> >
> > # core_initcall
> > init_watchdog_freq_notifier()
> > # device_initcall
> > <cpufreq driver policy is created>
> >
> > Now hardlockup_detector_perf_adjust_period() will be called before
> > armv8_pmu_driver_init(). Is that a problem?
>
> it doesn't matter, since in this situation:
>
> # device_initcall
> <cpufreq driver policy is created>
> hardlockup_detector_perf_adjust_period()
> if (!event) // armv8_pmu_driver_init() not called, no event created
> return;
> # later after armv8_pmu_driver_init()
> hw_nmi_get_sample_period() // since cpufreq already been registered, we can
> // get the accurate cpufreq
Got it. Sorry for asking the dumb questions--I didn't have time to dig
through the code earlier this week. Yeah, looks like it should be OK.
> > I guess even worse would be if async probe was on and the cpufreq
> > driver and PMU driver were running at the same time?
> >
>
> I think there're 2 cases here:
> 1) cpufreq is initialized before hw_nmi_get_sample_period() is called, in
> which case the cpufreq is accurate when creating the event
> 2) hw_nmi_get_sample_period() is called before the cpufreq driver is
> initialized, in which case we rely on the notifier call to adjust the event's period
>
> May need further check to see if there's any race window here, but the worst
> case will be using the inaccurate cpufreq (5GHz) for calculating the period.
OK, sure. _Maybe_ there could be some small race if these two things
are happening at exactly the same time on two different cores:
core 1:
* lockup_detector_delay_init()
* watchdog_hardlockup_probe()
* hardlockup_detector_event_create()
* hw_nmi_get_sample_period()
* No cpufreq driver, so returns SAFE_MAX_CPU_FREQ
core 2:
* cpufreq driver probes + policy is created
* "event" hasn't been created yet, so your new code is a no-op
core 1:
* event gets created based on SAFE_MAX_CPU_FREQ.
Did I get that right?
That doesn't feel _completely_ impossible especially since
lockup_detector_delay_init() is already running on a separate
workqueue. As you said, though, the worst case is it looks like we'll
just end up with how things were before your patch series.
In any case, I guess it's pretty unlikely, but maybe it wouldn't be
very hard to add a mutex in there? It doesn't feel like that would add
much overhead...
-Doug