We have been working on a new initial filesystem called initoverlayfs. It is a new filesystem that provides a more scalable approach to initial filesystems as opposed to just using initrds.

We are writing this RFC to the systemd and dracut mailing lists (feel free to forward to the UAPI group also) because, although this solution works without changing the code in these projects, it operates in the same area as systemd, udev, dracut, etc. and uses these tools.

Brief context:
--------------

initoverlayfs by default uses transient overlays rather than tmpfs to create throwaway filesystems early in the boot sequence.

Why? An initramfs has to be decompressed and copied to a tmpfs up front before it can be used. As a result, you pay in boot performance for every byte in an initrd, even the ones you don't use in a given boot. This leads to a fear of using languages that result in larger binary sizes in early boot, of reusing libraries, etc. In some cases, reimplemented, minified versions of software components present in the rootfs are used instead.

By contrast, initoverlayfs uses erofs (with compression) and overlayfs, so you only pay for the bytes you actually use.

There is also increasing pressure from certain industries, such as automotive, to start essential services early in the boot sequence.

Requirements:
-------------

- An init system
- An initramfs building tool
- A device manager
- overlayfs

Nothing that you wouldn't find in most Linux distributions today.

Design:
-------

The diagram below shows the boot sequence with initoverlayfs integrated. The mini-initramfs contains just enough to get storage drivers loaded and storage devices initialized. storage-init is a process that is not designed to replace init; it does just enough to initialize storage (it performs a targeted udev trigger on storage), switches to initoverlayfs as root, and then executes init.
```
fw -> bootloader -> kernel -> mini-initramfs -> initoverlayfs -> rootfs

fw -> bootloader -> kernel -> storage-init   -> init ----------------->
```

Benefits:
---------

Scalability: You can put less emphasis on keeping this initial filesystem small, as you only pay for the bytes you read. This is probably a bigger win than the raw performance described in the next point.

Performance: Because this shrinks the initramfs to contain only the most basic storage initialization tasks, Linux userspace starts earlier than it would with an initramfs alone, and all the other software that requires an early throwaway filesystem is executed from the initoverlayfs. On a Raspberry Pi 4 with an SD card, this leads to systemd starting ~300ms faster, and on a Raspberry Pi 4 with an NVMe SSD over USB it leads to systemd starting ~500ms faster. On some devices, starting Linux userspace earlier can expose a slowly initializing storage driver and lead to a slower boot, because with just an initramfs that slow driver is masked by the time spent on decompression and copying. But a computer is only as fast as its slowest component, so if you care about super fast boots, you need to optimize your storage drivers.

Flexibility: It is now easier to consider using fatter languages like Rust, etc. Using graphics libraries, camera libraries, libevent, glib, C++, etc. in early boot can also be considered, as you don't have to decompress and copy this data up front. This also leads to initrd software that is easier to maintain, with more consolidation between the rootfs implementations and the initial filesystem implementations of components.

Changes required in other projects:
-----------------------------------

There are no major changes required in other projects. Tools like systemd-analyze might need to be updated to recognize this boot sequence more accurately, because they have no awareness of initoverlayfs.
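
To make the "only pay for the bytes you read" argument from the Benefits section a little more concrete, here is a rough back-of-the-envelope sketch in Python. It is not a measurement and not part of initoverlayfs: the function names, the throughput figure and the image sizes are all hypothetical, and the model deliberately ignores the mini-initramfs, mount overhead and storage driver initialization. It only illustrates that the up-front cost grows with the image size, while the on-demand cost depends only on what a given boot actually reads.

```python
MIB = 1024 * 1024


def upfront_cost_s(image_bytes, throughput_bytes_per_s):
    """Classic initramfs model: the whole image is decompressed and
    copied to tmpfs before anything in it can be used."""
    return image_bytes / throughput_bytes_per_s


def on_demand_cost_s(bytes_read, throughput_bytes_per_s):
    """On-demand model: only the data a given boot actually reads is
    fetched and decompressed; the image size itself does not matter."""
    return bytes_read / throughput_bytes_per_s


if __name__ == "__main__":
    throughput = 100 * MIB  # hypothetical effective throughput, bytes/s
    bytes_used = 10 * MIB   # hypothetical amount of data one boot reads

    for image_mib in (20, 50, 200):  # hypothetical image sizes
        upfront = upfront_cost_s(image_mib * MIB, throughput)
        on_demand = on_demand_cost_s(bytes_used, throughput)
        print(f"{image_mib:>4} MiB image: up-front {upfront:.2f}s, "
              f"on-demand {on_demand:.2f}s")
```

In this toy model, growing the image only moves the up-front number, which is the scalability point above: adding rarely used content to the initial filesystem stops being a boot-time cost.
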

Future plans:
-------------

We intend to propose this for Fedora and CentOS Stream, in both ostree and non-ostree variants, as we continue this project.

Feel free to try:
-----------------

It should work on most standard 3 partition non-ostree Fedora and CentOS 9 installs (note: the CentOS 9 kernel does not support erofs compression, so Fedora is a better playground today). It's still in an alpha/beta state, I guess. I successfully dogfood this on my laptop and we have tried it on a couple of different pieces of hardware and VMs, but maybe run this on a non-critical piece of hardware or a VM for the next few weeks if you want to try :)

git repo:

https://github.com/containers/initoverlayfs

Also check out the README.md, there are some graphs and other information there:

https://github.com/containers/initoverlayfs/blob/main/README.md

rpm available in copr:

    dnf copr enable @centos-automotive-sig/next
    dnf install initoverlayfs
    initoverlayfs-install

Is mise le meas/Regards,

Eric Curtin