Hello Eric, Thanks for chiming in; I should have thought to CC you at the start. I have a question or two, below. On Mon, 9 Sep 2019 at 12:40, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote: > > "Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes: > > > Hello Philipp, > > > > On Tue, 6 Aug 2019 at 10:12, Philipp Wendler <ml@xxxxxxxxxxxxxxxxx> wrote: > >> > >> Hello Michael, hello Aleksa, > >> > >> Am 05.08.19 um 14:29 schrieb Michael Kerrisk (man-pages): > >> > >> > On 8/5/19 12:36 PM, Aleksa Sarai wrote: > >> >> On 2019-08-01, Michael Kerrisk (man-pages) <mtk.manpages@xxxxxxxxx> wrote: > >> >>> I'd like to add some documentation about the pivot_root(".", ".") > >> >>> idea, but I have a doubt/question. In the lxc_pivot_root() code we > >> >>> have these steps > >> >>> > >> >>> oldroot = open("/", O_DIRECTORY | O_RDONLY | O_CLOEXEC); > >> >>> newroot = open(rootfs, O_DIRECTORY | O_RDONLY | O_CLOEXEC); > >> >>> > >> >>> fchdir(newroot); > >> >>> pivot_root(".", "."); > >> >>> > >> >>> fchdir(oldroot); // **** > >> >>> > >> >>> mount("", ".", "", MS_SLAVE | MS_REC, NULL); > >> >>> umount2(".", MNT_DETACH); > >> >> > >> >>> fchdir(newroot); // **** > >> >> > >> >> And this one is required because we are in @oldroot at this point, due > >> >> to the first fchdir(2). If we don't have the first one, then switching > >> >> from "." to "/" in the mount/umount2 calls should fix the issue. > >> > > >> > See my notes above for why I therefore think that the second fchdir() > >> > is also not needed (and therefore why switching from "." to "/" in the > >> > mount()/umount2() calls is unnecessary. > >> > > >> > Do you agree with my analysis? > >> > >> If both the second and third fchdir are not required, > >> then we do not need to bother with file descriptors at all, right? > > > > Exactly. > > > >> Indeed, my tests show that the following seems to work fine: > >> > >> chdir(rootfs) > >> pivot_root(".", ".") > >> umount2(".", MNT_DETACH) > > > > Thanks for the confirmation, That's also exactly what I tested. > > > >> I tested that with my own tool[1] that uses user namespaces and marks > >> everything MS_PRIVATE before, so I do not need the mount(MS_SLAVE) here. > >> > >> And it works the same with both umount2("/") and umount2("."). > > > > Yes. > > > >> Did I overlook something that makes the file descriptors required? > > > > No. > > > >> If not, wouldn't the above snippet make sense as example in the man page? > > > > I have exactly that snippet in a pending change for the manual page :-). > > I have just spotted this conversation and I expect if you are going > to use this example it is probably good to document what is going > on so that people can follow along. (Sounds reasonable.) > >> chdir(rootfs) > >> pivot_root(".", ".") > > At this point the mount stack should be: > old_root > new_root > rootfs In this context, what is 'rootfs'? The initramfs? At least, when I examine /proc/PID/mountinfo. When I look at the / mount point in /proc/PID/mountinfo, I see just old_root new_root But nothing below 'new_root'. So, I'm a little puzzled. By the way, why is 'old_root' stacked above 'new_root', do you know? I mean, in this scenario it turns out to be useful, but it's kind of the opposite from what I would have expected. (And if this was a deliverate design decision in pivot_root(), it was never made explicit.) > With "." and "/" pointing to new_root. > > >> umount2(".", MNT_DETACH) > > At this point resolving "." starts with new_root and follows up the > mount stack to old-root. Okay. > Ordinarily if you unmount "/" as is happening above you then need to > call chroot and possibly chdir to ensure neither "/" nor "." point to > somewhere other than the unmounted root filesystem. In this specific > case because "/" and "." resolve to new_root under the filesystem that is > being unmounted that all is well. s/that/then/ ? Thanks, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/