The volatile option is great for "ephemeral" containers. Unfortunately,
it doesn't capture all uses. There are two ways to use it safely right
now:

1. Throw away the entire upperdir between mounts
2. Manually syncfs between mounts

For certain use-cases like serverless, or short-lived containers, it is
advantageous to be able to stop the container (runtime) and start it up
on demand / invocation of the function. Usually, there is some bootstrap
process which involves downloading some artifacts, or putting secrets on
disk, and then upon invocation of the function, you want to (re)start
the container.

If you have to syncfs every time you do this, it can lead to excess
filesystem overhead for all of the other containers on the machine, and
stall out every container whose upperdir is on the same underlying
filesystem, unless your filesystem offers something like subvolumes and
sync can be restricted to a subvolume.

The kernel has information that it can use to determine whether or not
this is safe -- primarily whether the underlying FS has had writeback
errors or not. Overlayfs doesn't delay writes, so the consistency of the
upperdir is not contingent on the mount of overlayfs, but rather on the
mount of the underlying filesystem. It can also make sure the underlying
filesystem wasn't remounted. Although it was suggested that we derive
this information from the upperdir's inode[1], we can instead checkpoint
this data on disk in an xattr.

Specifically we checkpoint:
  * Superblock "id": This is a new concept introduced in one of the
    patches which keeps track of (re)mounts of filesystems, by having a
    per-boot, monotonically increasing integer identifying the
    superblock. This is safer than trying to obfuscate the pointer and
    putting it into an xattr (due to leak risk, and address reuse), and
    during the course of a boot, the u64 should not wrap.
  * Overlay "boot id": This is a new UUID that is overlayfs-specific, as
    overlayfs is a module that's independent from the rest of the system
    and can be (re)loaded independently -- thus it generates a UUID at
    load time which can be used to uniquely identify it.
  * upperdir / workdir errseq: A sample of the errseq_t on the workdir /
    upperdir's superblock. Since the errseq_t is implemented as a u32
    with errno + error counter, we can safely store it in a checkpoint.

[1]: https://lore.kernel.org/linux-unionfs/CAOQ4uxhadzC3-kh-igfxv3pAmC3ocDtAQTxByu4hrn8KtZuieQ@xxxxxxxxxxxxxx/

Sargun Dhillon (3):
  fs: Add s_instance_id field to superblock for unique identification
  overlay: Add ovl_do_getxattr helper
  overlay: Add the ability to remount volatile directories when safe

 Documentation/filesystems/overlayfs.rst |  5 +-
 fs/overlayfs/overlayfs.h                | 43 +++++++++++++
 fs/overlayfs/readdir.c                  | 86 +++++++++++++++++++++++--
 fs/overlayfs/super.c                    | 22 ++++++-
 fs/super.c                              |  3 +
 include/linux/fs.h                      |  7 ++
 include/uapi/linux/fs.h                 |  2 +
 7 files changed, 160 insertions(+), 8 deletions(-)

-- 
2.25.1
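
For illustration only, the three checkpointed values described above
could be compared at mount time roughly as follows. This is a userspace
sketch, not the code from the patches: the struct name, field names, and
the helper checkpoint_still_valid() are all hypothetical, and the real
on-disk xattr layout may differ.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical checkpoint layout. The names and field order are
 * assumptions made for this sketch, not the patch's xattr format.
 */
struct volatile_checkpoint {
	uint64_t sb_instance_id;  /* per-boot id of the upper superblock */
	uint8_t  ovl_boot_id[16]; /* UUID generated when overlayfs loads */
	uint32_t errseq;          /* sampled errseq_t of the upper superblock */
};

/*
 * Return true only if none of the three values changed between the
 * saved checkpoint and the current state. Any mismatch means the
 * volatile upperdir cannot be assumed consistent.
 */
static bool checkpoint_still_valid(const struct volatile_checkpoint *saved,
				   const struct volatile_checkpoint *now)
{
	/* The underlying filesystem was (re)mounted since the checkpoint. */
	if (saved->sb_instance_id != now->sb_instance_id)
		return false;

	/* The overlayfs module was reloaded, or the machine rebooted. */
	if (memcmp(saved->ovl_boot_id, now->ovl_boot_id,
		   sizeof(saved->ovl_boot_id)) != 0)
		return false;

	/* A writeback error was recorded on the underlying filesystem. */
	if (saved->errseq != now->errseq)
		return false;

	return true;
}
```

The sketch treats any difference in the errseq sample as unsafe, which
is the conservative choice; a real implementation would presumably use
the kernel's errseq helpers rather than a raw integer comparison.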