[RFC] Pre-seeded files/directories for UBIFS

Richard Weinberger richard at nod.at
Sat May 20 09:12:39 PDT 2017


Hi!

Recently I had an interesting discussion with Christoph about overlayfs and
the burden it imposes. The main use case of overlayfs in combination with UBIFS
is having a squashfs as the lower and UBIFS as the upper directory, so that all
changes to the read-only squashfs go into UBIFS. Upon a factory reset all files
within the UBIFS are removed and the merged directory is clean again. Christoph
argued that such functionality could be achieved without overlayfs if the
filesystem supported something like pre-seeded files or directories. This would
lower memory pressure and complexity.

Today I gave this some thought and I'm pretty sure we can implement it in
UBIFS without too much effort. The basic idea is to mark files or whole
directories as seed at mkfs.ubifs time.
If these files are changed at run time, the original contents stay on
the medium and the files can be reverted to their seed state at any time.
This includes file contents, attributes and extended attributes. I can think of
a UBIFS-specific ioctl to put files into seed state and to revert them again.

Since UBIFS is already a pure copy-on-write filesystem, all we have to do is
teach the index tree about seeds. We could add a flag to the UBIFS key which
indicates that the node behind this key is seeded.
E.g. if file "foo" is seeded and the corresponding inode number is 0x1234,
then every key of every UBIFS node that belongs to that file will carry the new
flag UBIFS_SEED_KEY:
ubifs_ino_node: 0x1234 | UBIFS_INO_KEY | UBIFS_SEED_KEY
ubifs_data_node: 0x1234 | <BLOCK_NO> | UBIFS_DATA_KEY | UBIFS_SEED_KEY
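
As a rough illustration, the two keys above could be modeled as below. This is
only a sketch: the unpacked struct, the helper names and the numeric values are
assumptions made for readability, and UBIFS_SEED_KEY is the flag proposed in
this mail, not something that exists in UBIFS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Key type names as used in the mail; the numeric values are illustrative. */
enum { UBIFS_INO_KEY = 0, UBIFS_DATA_KEY = 1 };

/* Hypothetical flag proposed in the mail; it does not exist in UBIFS. */
#define UBIFS_SEED_KEY 0x1u

/* Simplified, unpacked key. Real UBIFS keys are packed into 32-bit words. */
struct seed_key {
	uint32_t inum;   /* inode number, e.g. 0x1234 */
	uint32_t block;  /* block number, only meaningful for data keys */
	uint8_t type;    /* UBIFS_INO_KEY or UBIFS_DATA_KEY */
	uint8_t flags;   /* UBIFS_SEED_KEY or 0 */
};

/* Build the key of the inode node of a (possibly seeded) file. */
static struct seed_key make_ino_key(uint32_t inum, bool seeded)
{
	struct seed_key k = { inum, 0, UBIFS_INO_KEY,
			      seeded ? UBIFS_SEED_KEY : 0 };
	return k;
}

/* Build the key of one data node of a (possibly seeded) file. */
static struct seed_key make_data_key(uint32_t inum, uint32_t block, bool seeded)
{
	struct seed_key k = { inum, block, UBIFS_DATA_KEY,
			      seeded ? UBIFS_SEED_KEY : 0 };
	return k;
}

/* True if the node behind this key belongs to the seed state. */
static bool key_is_seeded(const struct seed_key *k)
{
	return (k->flags & UBIFS_SEED_KEY) != 0;
}
```

With these helpers, make_data_key(0x1234, block, true) would describe a seeded
data node of "foo", and the same call with false its run-time copy.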

The inode itself will have a flag which denotes whether this file is seeded and
whether any modifications have happened. This will allow us to look up directly
in the index tree with UBIFS_SEED_KEY set or not. Otherwise we'd have to do two
lookups every time.
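
How the inode flag could steer the lookup might look roughly like this. The
flag names SEED_INO_SEEDED and SEED_INO_MODIFIED and both helpers are made up
for this sketch; the mail only says that such state lives in the inode.

```c
#include <stdbool.h>

/* Hypothetical per-inode flags; the names are invented for this sketch. */
#define SEED_INO_SEEDED   0x1u  /* file was marked as seed at mkfs.ubifs time */
#define SEED_INO_MODIFIED 0x2u  /* at least one non-seed node was written */

/*
 * True if the first index lookup should carry UBIFS_SEED_KEY.
 * A seeded file without modifications has only seed nodes, so a single
 * lookup with the flag set is enough.
 */
static bool first_lookup_uses_seed_key(unsigned int ino_flags)
{
	return (ino_flags & SEED_INO_SEEDED) &&
	       !(ino_flags & SEED_INO_MODIFIED);
}

/*
 * True if a miss on the plain lookup may need a second lookup with
 * UBIFS_SEED_KEY. Only seeded files with modifications pay this cost.
 */
static bool may_need_seed_fallback(unsigned int ino_flags)
{
	return (ino_flags & SEED_INO_SEEDED) &&
	       (ino_flags & SEED_INO_MODIFIED);
}
```

Files that are not seeded at all keep today's single plain lookup in this
model.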

If a seeded node is modified, it stays referenced in the index tree
and a copy without UBIFS_SEED_KEY is made. On the next lookup the new node will
be used automatically. Reverting to the original state means purging all nodes
that don't have the UBIFS_SEED_KEY flag.
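
The modify and revert cycle can be sketched with a toy in-memory index. The
flat array stands in for the index tree, one byte stands in for node contents,
and all names are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 16

struct node {
	uint32_t inum;
	uint32_t block;
	bool seed;   /* stands in for UBIFS_SEED_KEY */
	bool live;   /* slot is in use */
	char data;   /* one byte of payload keeps the sketch small */
};

struct index { struct node n[MAX_NODES]; };

/* Exact-match lookup, including the seed flag. */
static struct node *find(struct index *idx, uint32_t inum, uint32_t block,
			 bool seed)
{
	for (size_t i = 0; i < MAX_NODES; i++) {
		struct node *p = &idx->n[i];
		if (p->live && p->inum == inum && p->block == block &&
		    p->seed == seed)
			return p;
	}
	return NULL;
}

/* Insert or overwrite a node; returns false if the index is full. */
static bool put(struct index *idx, uint32_t inum, uint32_t block, bool seed,
		char data)
{
	struct node *p = find(idx, inum, block, seed);
	for (size_t i = 0; !p && i < MAX_NODES; i++)
		if (!idx->n[i].live)
			p = &idx->n[i];
	if (!p)
		return false;
	*p = (struct node){ inum, block, seed, true, data };
	return true;
}

/* A run-time write never touches the seed node; it adds a non-seed copy. */
static bool write_block(struct index *idx, uint32_t inum, uint32_t block,
			char data)
{
	return put(idx, inum, block, false, data);
}

/* Prefer the modified copy, otherwise fall back to the seed node. */
static const struct node *read_block(struct index *idx, uint32_t inum,
				     uint32_t block)
{
	struct node *p = find(idx, inum, block, false);
	return p ? p : find(idx, inum, block, true);
}

/* Revert: purge every node of the inode that lacks the seed flag. */
static void revert_to_seed(struct index *idx, uint32_t inum)
{
	for (size_t i = 0; i < MAX_NODES; i++)
		if (idx->n[i].live && idx->n[i].inum == inum &&
		    !idx->n[i].seed)
			idx->n[i].live = false;
}
```

Seeding a block with put(..., true, ...), overwriting it with write_block()
and then calling revert_to_seed() makes read_block() return the seed contents
again, since the seed node was never removed.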

There are corner cases to consider, mostly for lookup of data nodes.
Currently a missing data node denotes a hole in a file.
With seeded files a missing data node can also mean that we need to fall back
to a UBIFS_SEED_KEY lookup.
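
The changed meaning of a missing data node could be expressed as a lookup with
three outcomes. This is a minimal sketch under the assumption that the caller
already knows whether the file is seeded; the types and names are invented
here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum lookup_result { FOUND_MODIFIED, FOUND_SEED, IS_HOLE };

/* One entry per data node of a single file; seed mirrors UBIFS_SEED_KEY. */
struct data_node {
	uint32_t block;
	bool seed;
};

static bool has_node(const struct data_node *nodes, size_t n, uint32_t block,
		     bool seed)
{
	for (size_t i = 0; i < n; i++)
		if (nodes[i].block == block && nodes[i].seed == seed)
			return true;
	return false;
}

static enum lookup_result lookup_data(const struct data_node *nodes, size_t n,
				      bool file_is_seeded, uint32_t block)
{
	if (has_node(nodes, n, block, false))
		return FOUND_MODIFIED;
	/* Without seeds, a miss at this point would already mean "hole". */
	if (file_is_seeded && has_node(nodes, n, block, true))
		return FOUND_SEED;
	return IS_HOLE;
}
```

Only when both lookups miss is the block treated as a hole.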

Storing UBIFS_SEED_KEY itself in the UBIFS key is also not trivial since
almost all bits of the 32-bit tuple are in use. But I'm sure we will find some
way to encode it.
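
To make the bit shortage concrete, here is one purely hypothetical encoding.
It assumes a 32-bit word that holds a 3-bit type field above a 29-bit block
number, and it further assumes that only type values 0..3 are needed so that
the top type bit can be repurposed. Whether that bit is really free is not
something this sketch establishes; all SK_* names are invented.

```c
#include <stdint.h>

#define SK_BLOCK_BITS 29u
#define SK_BLOCK_MASK 0x1FFFFFFFu
#define SK_TYPE_MASK  0x3u         /* assumes only types 0..3 are in use */
#define SK_SEED_BIT   0x80000000u  /* hypothetical: top bit of the type field */

/* Pack type, block number and seed flag into one 32-bit word. */
static uint32_t sk_pack(uint32_t type, uint32_t block, int seeded)
{
	uint32_t w = ((type & SK_TYPE_MASK) << SK_BLOCK_BITS) |
		     (block & SK_BLOCK_MASK);
	return seeded ? (w | SK_SEED_BIT) : w;
}

static uint32_t sk_type(uint32_t w)
{
	return (w >> SK_BLOCK_BITS) & SK_TYPE_MASK;
}

static uint32_t sk_block(uint32_t w)
{
	return w & SK_BLOCK_MASK;
}

static int sk_seeded(uint32_t w)
{
	return (w & SK_SEED_BIT) != 0;
}
```

In this layout type and block number fill all 32 bits, which is why the flag
has to take a bit away from one of the existing fields.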

Another thing to consider is seeded directory entries. This requires us to
implement our own whiteout mechanism. But this could also be reused for
overlayfs whiteouts.
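
Why a whiteout is needed can be shown with a small model: a seeded directory
entry must stay on the medium, so deleting it at run time has to be recorded
as a separate non-seed entry that hides it. The entry kinds and the helper
below are assumptions for illustration, and the sketch assumes at most one
non-seed entry per name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum dent_kind { DENT_SEED, DENT_NORMAL, DENT_WHITEOUT };

struct dent {
	const char *name;
	enum dent_kind kind;
};

/*
 * True if name is visible in the directory.
 * Assumes at most one non-seed entry (normal or whiteout) per name.
 */
static bool dent_visible(const struct dent *d, size_t n, const char *name)
{
	bool seed_found = false;

	for (size_t i = 0; i < n; i++) {
		if (strcmp(d[i].name, name) != 0)
			continue;
		if (d[i].kind == DENT_WHITEOUT)
			return false;  /* whiteout hides the seed entry */
		if (d[i].kind == DENT_NORMAL)
			return true;   /* created or replaced at run time */
		seed_found = true;
	}
	return seed_found;
}
```

Reverting the directory would then mean purging the normal and whiteout
entries, after which only the seed entries decide what is visible.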

That said, I consider this feature doable but not trivial.
Artem, Adrian, you designed UBIFS, what do you think?
Maybe I missed some major show-stopper. :)

Thanks,
//richard



More information about the linux-mtd mailing list