Forcing devices into idle

Thu Jul 3 06:46:03 PDT 2025

On Thu, Jul 3, 2025 at 3:32 PM Thierry Reding <thierry.reding at gmail.com> wrote:
>
> On Thu, Jul 03, 2025 at 02:06:00PM +0200, Thierry Reding wrote:
> > On Thu, Jul 03, 2025 at 01:12:15PM +0200, Rafael J. Wysocki wrote:
> > > On Thu, Jul 3, 2025 at 12:33 PM Oliver Neukum <oneukum at suse.com> wrote:
> > > >
> > > > On 03.07.25 12:08, Thierry Reding wrote:
> > > >
> > > > > Any thoughts on how to solve this? Is the pm_runtime_{put,get}_sync()
> > > > > method acceptable? If not, are there other alternatives to achieve the
> > > > > same thing that I'm not aware of? Would it be useful to add a new set of
> > > > > APIs to force devices into an idle state (which could be semantically
> > > > > different from runtime suspend)? Or is this all too specific for any
> > > > > kind of generic API?
> > > >
> > > > Basically what you need is what happens when the system prepares to
> > > > do a snapshot for S4. However, if you just perform FREEZE and then THAW,
> > > > devices will assume that user space has been frozen. You need a way
> > > > to substitute for that assumption.
> > >
> > > Well, you just need to freeze user space beforehand.
> >
> > Freezing userspace seems a bit heavy-handed. There's only a very few
> > devices that need to be put into reset (such as the GPU), so most of
> > userspace should be fine to continue to run. For the GPU the idea is
> > to block all incoming requests while the device is forced into idle,
> > and then to resume processing these requests after the VPR resize.
> >
> > But maybe freezing userspace isn't the heavy operation that I think it
> > is. Ideally we do not want to suspend things for too long to avoid
> > stuttering on the userspace side.
> >
> > Also, I think we'd want the freezing to be triggered by the VPR driver
> > because userspace ideally doesn't know when the resizing happens. The
> > DMA BUF heap API that I'm trying to use is too simple for that, and
> > only the VPR driver knows when a resize needs to happen.
> >
> > Is it possible to trigger the freeze from a kernel driver? Or localize
> > the freezing of userspace to only the processes that are accessing a
> > given device?
> >
> > Other than that, freeze() and thaw() seem like the right callbacks for
> > this.
>
> I've prototyped this using the sledgehammer freeze_processes() and
> thaw_processes() functions and the entire process seems to be pretty
> quick. I can get through most of it in ~30 ms. This is on a mostly
> idle test system, so I expect this to go up significantly if there
> is a high load.
>
> On the other hand, this will drastically simplify the GPU driver
> implementation, because by the time ->freeze() is called, all userspace
> will be frozen, so there's no need to do any blocking on the kernel
> side.
>
> What I have now is roughly this:
>
>         freeze_processes();
>
>         for each VPR device dev:
>                 pm_generic_freeze(dev);
>
>         resize_vpr();
>
>         for each VPR device dev:
>                 pm_generic_thaw(dev);
>
>         thaw_processes();
>
> I still can't shake the feeling that this is sketchy, but it seems to
> work. Is there anything blatantly wrong about this?

There are a few things to take into consideration.

First, there are four tiers of "freeze" callbacks (->prepare, ->freeze,
->freeze_late, ->freeze_noirq), and analogously four tiers of "thaw"
callbacks (->thaw_noirq, ->thaw_early, ->thaw, ->complete), but you only
use one of each.  This may be fine in a particular case, but you need to
ensure that the other tiers are not needed and, in particular, that the
_noirq ones need not be involved.  Also ensure that the callbacks you do
invoke don't assume that PM notifiers have run (or that they will run on
the resume side).
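
To illustrate the check described above, here is a minimal user-space
sketch in C. It is not kernel code: struct mock_pm_ops, the MOCK_* flags,
mock_nop() and mock_skipped_tiers() are all hypothetical names that only
mirror the freeze/thaw members of struct dev_pm_ops. It reports which
populated tiers a path calling only ->freeze and ->thaw would skip. In a
real audit the callbacks may also come from the PM domain, device type,
class or bus rather than the driver, which this sketch does not model.

```c
/* Hypothetical stand-in for the freeze/thaw members of struct dev_pm_ops. */
struct mock_pm_ops {
	int (*prepare)(void *dev);
	int (*freeze)(void *dev);
	int (*freeze_late)(void *dev);
	int (*freeze_noirq)(void *dev);
	int (*thaw_noirq)(void *dev);
	int (*thaw_early)(void *dev);
	int (*thaw)(void *dev);
	void (*complete)(void *dev);
};

/* One flag per tier that a ->freeze/->thaw-only sequence does not call. */
enum mock_tier {
	MOCK_PREPARE      = 1 << 0,
	MOCK_FREEZE_LATE  = 1 << 1,
	MOCK_FREEZE_NOIRQ = 1 << 2,
	MOCK_THAW_NOIRQ   = 1 << 3,
	MOCK_THAW_EARLY   = 1 << 4,
	MOCK_COMPLETE     = 1 << 5,
};

/* Placeholder callback used to populate a tier in examples. */
int mock_nop(void *dev)
{
	(void)dev;
	return 0;
}

/*
 * Return a bitmask of the tiers that are populated in @ops but would be
 * skipped if only ->freeze and ->thaw were invoked.  A result of 0 means
 * the single-tier approach does not leave any callback out.
 */
unsigned int mock_skipped_tiers(const struct mock_pm_ops *ops)
{
	unsigned int skipped = 0;

	if (ops->prepare)
		skipped |= MOCK_PREPARE;
	if (ops->freeze_late)
		skipped |= MOCK_FREEZE_LATE;
	if (ops->freeze_noirq)
		skipped |= MOCK_FREEZE_NOIRQ;
	if (ops->thaw_noirq)
		skipped |= MOCK_THAW_NOIRQ;
	if (ops->thaw_early)
		skipped |= MOCK_THAW_EARLY;
	if (ops->complete)
		skipped |= MOCK_COMPLETE;

	return skipped;
}
```

A nonzero result only says that a callback exists, not that it matters
for the idle/resize case; that still has to be judged per device.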

Second, if there are dependencies between the devices being frozen and
other devices, they will have to be taken into account.
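
One way to read "taken into account" is the ordering the PM core
normally applies: consumers (or children) are quiesced before their
suppliers (or parents) and brought back in the reverse order. The
user-space sketch below only illustrates that ordering; struct mock_dev
and mock_freeze_thaw() are hypothetical names, and the caller is assumed
to pass the devices in supplier-first order.

```c
#include <stddef.h>

/* Hypothetical device record; the *_seq fields log the call order. */
struct mock_dev {
	const char *name;
	int freeze_seq;
	int thaw_seq;
};

/*
 * @devs is assumed to be sorted supplier-first.  Freeze walks the list
 * backwards so that consumers go idle before the devices they depend
 * on; thaw walks it forwards so that suppliers are back first.
 */
void mock_freeze_thaw(struct mock_dev **devs, size_t n)
{
	int seq = 0;
	size_t i;

	for (i = n; i-- > 0; )
		devs[i]->freeze_seq = seq++;

	/* The resize (or other critical operation) would happen here. */

	for (i = 0; i < n; i++)
		devs[i]->thaw_seq = seq++;
}
```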

Also note that kernel threads are generally not affected by
freeze_processes(), but I guess this is not a problem in your use
case.

Thanks!