Hello! On Tue, May 25, 2021 at 6:58 AM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: > > > In fact, the network namespace is just a case for the problem that the > > 'mount leak' caused. And this kind modification is not friendly to > > current docker users, it makes great changes to the usage of docker. > > You mean an upgrade of docker? If so... that does not seem like a > definitive reason to do something new in the kernel *always*. No, I means that in this solution, users have to create all the containers first, and create the network namespace for them later. > > However, if you introduce it as a kconfig option so that users > who want to use this new feature can enable it, and then use it, > the its sold as a new feature. > > Should this always be enabled, or done this way? Should we never have > the option to revert back to the old behaviour? If not, why not? > This change seems transparent to users, which don't change the behavior of initramfs. However, it seems more reasonable to make it a kconfig option. I'll do it in the v2 of the three patches I sended. > > I don't see you using any context directly, where are you specifying the > context directly? > > > so it is not needed any more > > and I just removed it in init_rootfs(). > > > > The initialization of 'user_root' in init_rootfs() is used in: > > do_populate_rootfs -> mount_user_root, which set the file system( > > ramfs or tmpfs) of the second mount. > > > > Seems it's not suitable to place it in init_rootfs()...... > > OK I think I just need to understand how you added the context of the > first mount explicitly now and where, as I don't see it. > ...... > > What do you mean? init_mount_tree() is always called, and it has > statically: > > static void __init init_mount_tree(void) > { > struct vfsmount *mnt; > ... > mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", NULL); > ... > } > > And as I noted, this is *always* called earlier than > do_populate_rootfs(). Your changes did not remove the init_mount_tree() > or modify it, and so why would the context of the above call always > be OK to be used now with a ramfs context now? > > > So it makes no sense to make the file system of the first mount selectable. > > Why? I don't see why, nor is it explained, we're always caling > vfs_kern_mount(&rootfs_fs_type, ...) and you have not changed that > either. > > > To simplify the code here, I make it ramfs_init_fs_context directly. In fact, > > it's fine to make it shmen_init_fs_context here too. > > So indeed you're suggesting its arbitrary now.... I don't see why. > So the biggest problem now seems to be the first mount I changed, maybe I didn't make it clear before. Let's call the first mount which is created in init_mount_tree() the 'init_mount'. If the 'root' is a block fs, initrd or nfs, the 'init_mount' will be a ramfs, that seems clear, it can be seen from the enable of tmpfs: void __init init_rootfs(void) { if (IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] && (!root_fs_names || strstr(root_fs_names, "tmpfs"))) is_tmpfs = true; } That makes sense, because the 'init_mount' will not be the root file system, as the block fs or initrd or nfs that mounted later on '/root' will become the root by a call of init_chroot(). If the 'root' is initramfs, cpio will be unpacked init 'init_mount'. In this scene, the 'init_mount' can be ramfs or tmpfs, and it is the root file system. In my patch, I create a second mount which is mounted on the 'init_mount', let's call it 'user_mount'. cpio is unpacked into 'user_mount', and the 'user_mount' is made the root by 'init_chroot()' later. The 'user_mount' can be ramfs or tmpfs. So 'init_mount' is not the root any more, and it makes no sense to make it tmpfs, just like why it should be ramfs when 'root' is a block fs or initrd. 'init_mount' is created from 'rootfs_fs_type' in init_mount_tree(): static void __init init_mount_tree(void) { struct vfsmount *mnt; ... mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", NULL); ... } I make the 'init_mount' to be ramfs by making 'rootfs_fs_type->init_fs_context' with 'ramfs_init_fs_context': struct file_system_type rootfs_fs_type = { .name = "rootfs", .init_fs_context = ramfs_init_fs_context, .kill_sb = kill_litter_super, }; I think the third patch that I sended has made it clear what I do to the 'init_mount'. Thanks! Menglong Dong