Forcing devices into idle

Thu Jul 3 06:32:24 PDT 2025

On Thu, Jul 03, 2025 at 02:06:00PM +0200, Thierry Reding wrote:
> On Thu, Jul 03, 2025 at 01:12:15PM +0200, Rafael J. Wysocki wrote:
> > On Thu, Jul 3, 2025 at 12:33 PM Oliver Neukum <oneukum at suse.com> wrote:
> > >
> > > On 03.07.25 12:08, Thierry Reding wrote:
> > >
> > > > Any thoughts on how to solve this? Is the pm_runtime_{put,get}_sync()
> > > > method acceptable? If not, are there other alternatives to achieve the
> > > > same thing that I'm not aware of? Would it be useful to add a new set of
> > > > APIs to force devices into an idle state (which could be semantically
> > > > different from runtime suspend)? Or is this all too specific for any
> > > > kind of generic API?
> > >
> > > Basically what you need is what happens when the system prepares to
> > > do a snapshot for S4. However, if you just perform FREEZE and then THAW,
> > > devices will assume that user space has been frozen. You need a way
> > > to substitute for that assumption.
> > 
> > Well, you just need to freeze user space beforehand.
> 
> Freezing userspace seems a bit heavy-handed. There are only very few
> devices that need to be put into reset (such as the GPU), so most of
> userspace should be fine to continue to run. For the GPU the idea is
> to block all incoming requests while the device is forced into idle,
> and then to resume processing these requests after the VPR resize.
> 
> But maybe freezing userspace isn't the heavy operation that I think it
> is. Ideally we do not want to suspend things for too long to avoid
> stuttering on the userspace side.
> 
> Also, I think we'd want the freezing to be triggered by the VPR driver
> because userspace ideally doesn't know when the resizing happens. The
> DMA BUF heap API that I'm trying to use is too simple for that, and
> only the VPR driver knows when a resize needs to happen.
> 
> Is it possible to trigger the freeze from a kernel driver? Or localize
> the freezing of userspace to only the processes that are accessing a
> given device?
> 
> Other than that, freeze() and thaw() seem like the right callbacks for
> this.

I've prototyped this using the sledgehammer freeze_processes() and
thaw_processes() functions and the entire process seems to be pretty
quick. I can get through most of it in ~30 ms. This is on a mostly
idle test system, so I expect this to go up significantly under
high load.

On the other hand, this will drastically simplify the GPU driver
implementation, because by the time ->freeze() is called, all userspace
will be frozen, so there's no need to do any blocking on the kernel
side.

What I have now is roughly this:

	freeze_processes();

	for each VPR device dev:
		pm_generic_freeze(dev);

	resize_vpr();

	for each VPR device dev:
		pm_generic_thaw(dev);

	thaw_processes();

I still can't shake the feeling that this is sketchy, but it seems to
work. Is there anything blatantly wrong about this?
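
With error handling filled in, the sequence above might look roughly
like the sketch below. It is a userspace mock, not driver code: the
struct device type and the five functions it calls are stand-ins that
only record the call order, so the unwinding logic can be exercised
outside the kernel. The name vpr_resize_frozen(), the fail_freeze
field and the trace buffer are hypothetical, and thawing in reverse
order is a choice made here, not something established in this
thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's struct device (assumption, mock only). */
struct device {
	const char *name;
	int fail_freeze;	/* hypothetical knob to simulate a failing ->freeze() */
};

static char trace[256];	/* records the order in which the stubs run */

static void note(const char *s)
{
	assert(strlen(trace) + strlen(s) < sizeof(trace));
	strcat(trace, s);
}

/* Stubs with the same shape as the kernel functions of the same name. */
static int freeze_processes(void) { note("F "); return 0; }
static void thaw_processes(void) { note("T "); }
static int resize_vpr(void) { note("R "); return 0; }

static int pm_generic_freeze(struct device *dev)
{
	if (dev->fail_freeze)
		return -EBUSY;
	note("f:"); note(dev->name); note(" ");
	return 0;
}

static int pm_generic_thaw(struct device *dev)
{
	note("t:"); note(dev->name); note(" ");
	return 0;
}

int vpr_resize_frozen(struct device **devs, size_t n)
{
	size_t i = 0;
	int err;

	/* Assumed to undo its own work on failure, so just bail out. */
	err = freeze_processes();
	if (err)
		return err;

	for (i = 0; i < n; i++) {
		err = pm_generic_freeze(devs[i]);
		if (err)
			goto thaw;	/* devs[0..i-1] are frozen at this point */
	}

	err = resize_vpr();

thaw:
	/* Thaw only what was frozen, in reverse order; thaw errors are ignored. */
	while (i-- > 0)
		pm_generic_thaw(devs[i]);

	thaw_processes();
	return err;
}
```

The point of the unwind path is that a failing ->freeze() on one
device never leaves userspace frozen or earlier devices stuck: only
the devices frozen so far are thawed before thaw_processes() runs.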

Thierry