Menglong Dong <menglong8.dong@xxxxxxxxx> writes:

> On Wed, May 26, 2021 at 2:50 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>>
> ......
>>
>> What is the flow where docker uses an initramfs?
>>
>> Just thinking about this I am not being able to connect the dots.
>>
>> The way I imagine the world is that an initramfs will be used either
>> when a linux system boots for the first time, or an initramfs would
>> come from the distribution you are running inside a container. In
>> neither case do I see docker being in a position to add functionality
>> to the initramfs as docker is not responsible for it.
>>
>> Is docker doing something creating like running a container in a VM,
>> and running some directly out of the initramfs, and wanting that code
>> to exactly match the non-VM case?
>>
>> If that is the case I think the easy solution would be to actually use
>> an actual ramdisk where pivot_root works.
>
> In fact, nowadays, initramfs is widely used by embedded devices in the
> production environment, which makes the whole system run in ram.
>
> That make sense. First, running in ram will speed up the system. The size
> of the system won't be too large for embedded devices, which makes this
> idea work. Second, this will reduce the I/O of disk devices, which can
> extend the life of the disk. Third, RAM is getting cheaper.
>
> So in this scene, Docker runs directly in initramfs.

That is the piece of the puzzle I was missing. A small system with its
root in an initramfs.

>> I really don't see why it makes sense for docker to be a special
>> snowflake and require kernel features that no other distribution does.
>>
>> It might make sense to create a completely empty filesystem underneath
>> an initramfs, and use that new rootfs as the unchanging root of the
>> mount tree, if it can be done with a trivial amount of code, and
>> generally make everything cleaner.
>>
>> As this change sits it looks like a lot of code to handle a problem
>> in the implementation of docker. Which quite frankly will be a pain
>> to have to maintain if this is not a clean general feature that
>> other people can also use.
>>
>
> I don't think that it's all for docker, pivot_root may be used by other
> users in the above scene. It may work to create an empty filesystem, as you
> mentioned above. But I don't think it's a good idea to make all users,
> who want to use pivot_root, do that. After all, it's not friendly to
> users.
>
> As for the code, it may look a lot, but it's not complex. Maybe a clean
> up for the code I add can make it better?

If we are going to do this, it should be something so small and clean
that it can be done unconditionally, always.

I will see if I can dig in and look a little more. I think there is
a reason Al Viro and H. Peter Anvin implemented initramfs this way.
Perhaps it was just a desire to make pivot_root unnecessary.

Container filesystem setup does throw a bit of a wrench in the works
as, unlike an initramfs where you can just delete everything, there is
not a clean way to get rid of a root filesystem you don't need without
pivot_root.

The net request as I understand it: Make the filesystem the initramfs
lives in be an ordinary filesystem so it can just be used as the
system's primary filesystem.

There might be technical reasons why that is a bad idea and userspace
would be requested to move everything into another ramfs manually (which
would have the same effect). But it is worth taking a good look to see
if it can be accomplished cleanly.

Eric