[PATCH 2/2] PM: sleep: Fix runtime PM based cpuidle support

Rafael J. Wysocki rafael at kernel.org
Thu Oct 21 06:45:09 PDT 2021


On Thu, Oct 21, 2021 at 1:49 PM Ulf Hansson <ulf.hansson at linaro.org> wrote:
>
> On Wed, 20 Oct 2021 at 20:18, Rafael J. Wysocki <rafael at kernel.org> wrote:
> >
> > On Wed, Sep 29, 2021 at 4:44 PM Ulf Hansson <ulf.hansson at linaro.org> wrote:
> > >
> > > In the cpuidle-psci case, runtime PM in combination with the generic PM
> > > domain (genpd) may be used when entering/exiting an idle state. More
> > > precisely, genpd relies on runtime PM to be enabled for the attached device
> > > (in this case it belongs to a CPU), to properly manage the reference
> > > counting of its PM domain.
> > >
> > > This works fine most of the time, but during system suspend in the
> > > dpm_suspend_late() phase, the PM core disables runtime PM for all devices.
> > > Beyond this point and until runtime PM becomes re-enabled in the
> > > dpm_resume_early() phase, calls to pm_runtime_get|put*() will fail.
> > >
> > > To make sure the reference counting in genpd becomes correct, we need to
> > > prevent cpuidle-psci from using runtime PM when it has been disabled for
> > > the device. Therefore, let's move the call to cpuidle_pause() from
> > > dpm_suspend_noirq() to dpm_suspend_late() - and cpuidle_resume() from
> > > dpm_resume_noirq() into dpm_resume_early().
> > >
> > > Diagnosed-by: Maulik Shah <mkshah at codeaurora.org>
> > > Suggested-by: Maulik Shah <mkshah at codeaurora.org>
> > > Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
> > > ---
> > >  drivers/base/power/main.c | 6 ++----
> > >  1 file changed, 2 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > index cbea78e79f3d..1c753b651272 100644
> > > --- a/drivers/base/power/main.c
> > > +++ b/drivers/base/power/main.c
> > > @@ -747,8 +747,6 @@ void dpm_resume_noirq(pm_message_t state)
> > >
> > >         resume_device_irqs();
> > >         device_wakeup_disarm_wake_irqs();
> > > -
> > > -       cpuidle_resume();
> > >  }
> > >
> > >  /**
> > > @@ -870,6 +868,7 @@ void dpm_resume_early(pm_message_t state)
> > >         }
> > >         mutex_unlock(&dpm_list_mtx);
> > >         async_synchronize_full();
> > > +       cpuidle_resume();
> > >         dpm_show_time(starttime, state, 0, "early");
> > >         trace_suspend_resume(TPS("dpm_resume_early"), state.event, false);
> > >  }
> > > @@ -1336,8 +1335,6 @@ int dpm_suspend_noirq(pm_message_t state)
> > >  {
> > >         int ret;
> > >
> > > -       cpuidle_pause();
> > > -
> > >         device_wakeup_arm_wake_irqs();
> > >         suspend_device_irqs();
> > >
> > > @@ -1467,6 +1464,7 @@ int dpm_suspend_late(pm_message_t state)
> > >         int error = 0;
> > >
> > >         trace_suspend_resume(TPS("dpm_suspend_late"), state.event, true);
> > > +       cpuidle_pause();
> > >         mutex_lock(&dpm_list_mtx);
> > >         pm_transition = state;
> > >         async_error = 0;
> > > --
> >
> > Well, this is somewhat heavy-handed and it affects even the systems
> > that don't really need to pause cpuidle at all in the suspend path.
>
> Yes, I agree.
>
> However, I am not really changing the behaviour in this regard.
> cpuidle_pause() is already being called in dpm_suspend_noirq(), for
> everybody today.

Yes, it is, but pausing it earlier will cause more energy to be spent,
potentially.

That said, there are not too many users of suspend_late callbacks in
the tree, so it may not matter too much.

> >
> > Also, IIUC you don't need to pause cpuidle completely, but make it
> > temporarily avoid idle states potentially affected by this issue.  An
> > additional CPUIDLE_STATE_DISABLED_ flag could be used for that I
> > suppose and it could be set via cpuidle_suspend() called from the core
> > next to cpufreq_suspend().
>
> cpuidle_suspend() would then need to go and fetch the cpuidle driver
> instance, which in some cases is one driver per CPU. Doesn't that get
> rather messy?

Per-CPU variables are used for that, so it is quite straightforward.
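
To illustrate the per-CPU lookup mentioned above, here is a minimal
userspace sketch, not kernel code. Plain arrays indexed by CPU number
stand in for the per-CPU variables, and every name in it (the model_*
types and functions, MODEL_SUSPEND_DISABLED, the needs_rpm field) is a
hypothetical stand-in rather than an existing cpuidle interface. It
assumes the driver can say which states depend on runtime PM.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS     4
#define MODEL_MAX_STATES  4

/* Hypothetical bit in the spirit of the CPUIDLE_STATE_DISABLED_ flags. */
#define MODEL_SUSPEND_DISABLED (1u << 2)

/* Assumption: the driver marks the states that rely on runtime PM. */
struct model_driver {
	int state_count;
	bool needs_rpm[MODEL_MAX_STATES];
};

/* Per-CPU device data holding one "disable" bitmask per idle state. */
struct model_device {
	unsigned int disable[MODEL_MAX_STATES];
};

/* Stand-ins for per-CPU pointers: one slot per CPU, NULL if unused. */
static struct model_driver *model_cpu_driver[MODEL_NR_CPUS];
static struct model_device *model_cpu_device[MODEL_NR_CPUS];

/*
 * Walk all CPUs, look up each CPU's driver and device through the
 * per-CPU slots, and flag only the states that depend on runtime PM.
 * Returns the number of CPUs that had a driver and a device.
 */
static int model_cpuidle_suspend(void)
{
	int touched = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		struct model_driver *drv = model_cpu_driver[cpu];
		struct model_device *dev = model_cpu_device[cpu];

		if (!drv || !dev)
			continue;

		for (int i = 0; i < drv->state_count && i < MODEL_MAX_STATES; i++) {
			if (drv->needs_rpm[i])
				dev->disable[i] |= MODEL_SUSPEND_DISABLED;
		}
		touched++;
	}
	return touched;
}
```

In this sketch, one driver per CPU costs nothing extra, because the
loop looks up whatever driver is registered for each CPU in turn.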

> Additionally, since find_deepest_state() is being called for
> cpuidle_enter_s2idle() too, we would need to treat the new
> CPUIDLE_STATE_DISABLED_ flag in a special way, right?

No, it already checks "disabled".
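
As a rough illustration of that point, the following userspace sketch
models a deepest-state search that skips any state whose "disable"
mask is non-zero. It is a simplified model, not the kernel's
find_deepest_state(): the model_* names and the MODEL_DISABLED_* bits
are hypothetical, and the real function applies further checks that
are left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bits modeled on the CPUIDLE_STATE_DISABLED_ idea. */
#define MODEL_DISABLED_BY_USER     (1u << 0)
#define MODEL_DISABLED_BY_DRIVER   (1u << 1)
#define MODEL_DISABLED_BY_SUSPEND  (1u << 2)  /* the proposed extra bit */

struct model_state {
	uint64_t exit_latency_ns;  /* deeper states have a larger latency */
	unsigned int disable;      /* non-zero means "do not use" */
};

/*
 * Return the index of the deepest usable state, or 0 if none of the
 * states above index 0 qualifies. Any set bit in "disable" excludes
 * the state, so a new bit needs no special handling in this loop.
 */
static int model_find_deepest_state(const struct model_state *states,
				    size_t count, uint64_t max_latency_ns)
{
	uint64_t latency_req = 0;
	int ret = 0;

	for (size_t i = 1; i < count; i++) {
		const struct model_state *s = &states[i];

		if (s->disable ||
		    s->exit_latency_ns <= latency_req ||
		    s->exit_latency_ns > max_latency_ns)
			continue;

		latency_req = s->exit_latency_ns;
		ret = (int)i;
	}
	return ret;
}
```

Setting MODEL_DISABLED_BY_SUSPEND on the deepest state makes the
search fall back to the next shallower one, with no change to the
search itself.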

> Is this really what we want?
>
> >
> > The other guys who rely on the cpuidle pausing today could be switched
> > over to this new mechanism later and it would be possible to get rid
> > of the pausing from the system suspend path completely.
>
> Not pausing cpuidle when it's not needed makes perfect sense.
> Although, it looks to me that we could also implement that on top of
> $subject patch.

Yes, it could.

> Unless you insist on the CPUIDLE_STATE_DISABLED_ way, I would probably
> explore an option to let a cpuidle driver set a global cpuidle flag
> during ->probe(). Depending on whether this flag is set, we can simply
> skip calling cpuidle_pause() during system suspend.
>
> What do you think?

Well, which driver in particular is in question here?


