On Tue, Jun 1, 2021 at 10:39 PM Christian Brauner <christian.brauner@xxxxxxxxxx> wrote: > [...] > > This code is duplicated below in this file > > void __init init_rootfs(void) > { > if (IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] && > (!root_fs_names || strstr(root_fs_names, "tmpfs"))) > is_tmpfs = true; > } > > so you should add a tiny inline helper that can be called in both > places. Will also allow you to get rid of one ifdef and makes the patch > smaller. Seems the code here is indeed duplicated, I'll replace it with an inline function. [...] > > + > > +/* > > + * The syscall 'pivot_root' is used to change root and it is able to > > + * clean the old mounts, which make it preferred by container platforms > > + * such as Docker. However, initramfs is not supported by pivot_root, > > + * and 'chroot()' has to be used, which is unable to clean the mounts > > + * that propagate from HOST. These useless mounts make the release of > > + * removable device or network namespace a big problem. > > + * > > + * To make initramfs supported by pivot_root, the mount of the root > > + * filesystem should have a parent, which will make it unmountable. In > > + * this function, the second mount, which is called 'user root', is > > + * created and mounted on '/root', and it will be made the root filesystem > > + * in end_mount_user_root() by init_chroot(). > > + * > > + * The 'user root' has a parent mount, which makes it unmountable and > > + * pivot_root work. > > + * > > + * What's more, root_mountflags and root_mount_data are used here, which > > + * makes the 'rootflags' in boot cmd work for 'user root'. > > I appreciate the detail but most of that should go in the commit > message it also repeats some info a couple of times. :) Here sm like the > following should suffice, I think: > > /* > * Give systems running from the initramfs and making use of pivot_root a > * proper mount so it can be umounted during pivot_root. > */ I added the comments here to make the folks understand what these changes are for, as LuisChamberlain suggested. Do you think that it is unnecessary and the commit message is enough for users to understand this function? [...] > > + > > static ssize_t __init xwrite(struct file *file, const char *p, size_t count, > > loff_t *pos) > > { > > @@ -682,15 +684,23 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie) > > else > > printk(KERN_INFO "Unpacking initramfs...\n"); > > > > + init_user_rootfs(); > > + > > + if (mount_user_root()) > > I would call this sm like > > prepare_mount_rootfs() > finish_mount_rootfs() Yeah, this name seems better! I'll change it. > > > + panic("Failed to create user root"); > > I don't think you need to call init_user_rootfs() separately? You could > just move it into the prepare_mount_rootfs()/mount_user_root() call. After rename 'mount_user_root()' to 'prepare_mount_rootfs()', it seems reasonable to call 'init_user_rootfs()' in 'prepare_mount_rootfs()'. > > > + > > err = unpack_to_rootfs((char *)initrd_start, initrd_end - initrd_start); > > if (err) { > > + end_mount_user_root(false); > > This boolean argument to end_mount_user_root() is a bit strange. Just > call init_umount() directly here? I don't think it is suitable to call init_umount() directly here. Before umount, 'init_chdir("/")' should be called too, and it seems a little weird to do these stuff in do_populate_rootfs(). According to the result, finish_mount_rootfs() will change root to the new mount or fall back and do the clean work, it seems fine. > > > #ifdef CONFIG_BLK_DEV_RAM > > populate_initrd_image(err); > > #else > > printk(KERN_EMERG "Initramfs unpacking failed: %s\n", err); > > #endif > > + goto done; > > } > > > > + end_mount_user_root(true); > > done: > > /* > > * If the initrd region is overlapped with crashkernel reserved region, > > diff --git a/usr/Kconfig b/usr/Kconfig > > index 8bbcf699fe3b..f9c96de539c3 100644 > > --- a/usr/Kconfig > > +++ b/usr/Kconfig > > @@ -52,6 +52,16 @@ config INITRAMFS_ROOT_GID > > > > If you are not sure, leave it set to "0". > > > > +config INITRAMFS_USER_ROOT > > I think the naming isn't great. Just call it INITRAMFS_MOUNT. The "user" > part in all the function and variabe names seems confusing to me at > least it doesn't convey a lot of useful info. So I'd just drop it and > try to stick with plain rootfs/initramfs terminology. Yeah, it now appears that 'user' seems indeed weird here. I thought that this mount is created for user space, seems kernel is using it too. I'll clean these 'user'. I appreciate your detail comments, thank you! Thanks! Menglong Dong