[RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and remove kthread freezing
Luis Chamberlain
mcgrof at kernel.org
Sat May 6 21:07:40 PDT 2023
On Thu, Feb 23, 2023 at 07:08:37PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 13, 2023 at 04:33:50PM -0800, Luis Chamberlain wrote:
> > Add support to automatically handle freezing and thawing filesystems
> > during the kernel's suspend/resume cycle.
> >
> > This is needed so that we really stop IO in flight without
> > races after userspace has been frozen. Without this we rely on
> > kthread freezing, whose semantics are loose and error prone.
> > For instance, even though a kthread may use try_to_freeze() and end
> > up being frozen, we have no way of being sure that everything that
> > has been spawned asynchronously from it (such as timers) has also
> > been stopped.
> >
> > A long term advantage of also adding filesystem freeze / thaw
> > support during suspend / hibernation is that we may eventually
> > be able to drop the kernel's thread freezing completely,
> > as it was originally added to stop disk IO in flight as we hibernate
> > or suspend.
>
> Hooray!
>
> One evil question though --
>
> Say you have dm devices A and B. Each has a distinct fs on it.
> If you mount A and then B and initiate a suspend, that should result in
> first B and then A freezing, right?
>
> After resuming, you then change A's dm-table definition to point it
> at a loop device backed by a file on B.
>
> What happens now when you initiate a suspend? B freezes, then A tries
> to flush data to the loop-mounted file on B, but it's too late for that.
> That sounds like a deadlock?
>
> Though I don't know how much we care about this corner case,

As you suggest, this is not the only corner case one could come up
with. There was that evil ioctl added years ago to allow flipping an
installed system booted from a USB or ISO over to the real freshly
installed root mount point. To make this bullet-proof we'll eventually
need to add a simple graph implementation to keep tabs on ordering
requirements for the super blocks. I have some C code which tries to
implement a graph Linux style, but since these are all corner cases at
this time, I think it's best we first fix suspend for most users and
later address a proper graph solution.
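
To make the ordering requirement concrete, here is a minimal userspace
sketch of a dependency graph over super blocks, using a plain
topological sort. It is not the C code mentioned above and not part of
this series. The names struct sb_graph, sb_graph_init(),
sb_graph_add_dep() and sb_freeze_order() are hypothetical, as is the
fixed MAX_SB limit. The assumption is that if A's backing storage
lives on B, then A must freeze before B so A can still flush to B.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SB 16	/* hypothetical upper bound, for illustration only */

/* Hypothetical stand-in for a super block dependency graph. */
struct sb_graph {
	size_t nr;	/* number of super blocks tracked */
	/* depends_on[a][b] != 0: a's backing storage lives on b */
	unsigned char depends_on[MAX_SB][MAX_SB];
};

void sb_graph_init(struct sb_graph *g, size_t nr)
{
	memset(g, 0, sizeof(*g));
	g->nr = nr > MAX_SB ? MAX_SB : nr;
}

/* Record that super block a writes through to super block b. */
int sb_graph_add_dep(struct sb_graph *g, size_t a, size_t b)
{
	if (a >= g->nr || b >= g->nr || a == b)
		return -1;
	g->depends_on[a][b] = 1;
	return 0;
}

/*
 * Fill order[] with a freeze order in which every super block is
 * frozen before anything it depends on. Thawing would walk order[]
 * backwards. Returns 0 on success, -1 if the dependencies form a
 * cycle and no safe order exists.
 */
int sb_freeze_order(const struct sb_graph *g, size_t *order)
{
	size_t pending[MAX_SB] = { 0 };	/* unfrozen dependents per node */
	unsigned char done[MAX_SB] = { 0 };
	size_t n = 0, a, b;

	for (a = 0; a < g->nr; a++)
		for (b = 0; b < g->nr; b++)
			if (g->depends_on[a][b])
				pending[b]++;

	while (n < g->nr) {
		size_t pick = g->nr;

		/* Lowest-numbered node with no unfrozen dependents. */
		for (a = 0; a < g->nr; a++) {
			if (!done[a] && pending[a] == 0) {
				pick = a;
				break;
			}
		}
		if (pick == g->nr)
			return -1;	/* cycle: nothing is safe to freeze */

		done[pick] = 1;
		order[n++] = pick;

		/* Whatever pick was writing to has one dependent fewer. */
		for (b = 0; b < g->nr; b++)
			if (g->depends_on[pick][b])
				pending[b]--;
	}
	return 0;
}
```

For the dm-on-loop case from the question, with A as node 0 and B as
node 1, sb_graph_add_dep(&g, 0, 1) makes sb_freeze_order() put A ahead
of B, whereas reverse mount order would freeze B first. A -1 return
would be a natural point to refuse auto-freeze for that setup.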
> Anyway, just wondering if you'd thought about that kind of doomsday
> scenario that a nutty sysadmin could set up.
>
> The only way I can think of to solve that kind of thing would be to hook
> filesystems and loop devices into the device model, make fs "device"
> suspend actually freeze, hope the suspend code suspends from the leaves
> inward, and hope I actually understand how the device model works (I
> don't.)

There are probably really odd things one can do. One thing I think
we can do later is simply annotate those cases and, over time, *not*
allow auto-freeze for those horrible situations.

A real long term solution, I think, will involve a graph.

  Luis