"Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes: > Hello Christian, > >>> All: I plan to add the following text to the manual page: >>> >>> new_root and put_old may be the same directory. In particular, >>> the following sequence allows a pivot-root operation without need‐ >>> ing to create and remove a temporary directory: >>> >>> chdir(new_root); >>> pivot_root(".", "."); >>> umount2(".", MNT_DETACH); >> >> Hm, should we mention that MS_PRIVATE or MS_SLAVE is usually needed >> before the umount2()? Especially for the container case... I think we >> discussed this briefly yesterday in person. > Thanks for noticing. That detail (more precisely: not MS_SHARED) is > already covered in the numerous other changes that I have pending > for this page: > > The following restrictions apply: > ... > - The propagation type of new_root and its parent mount must not > be MS_SHARED; similarly, if put_old is an existing mount point, > its propagation type must not be MS_SHARED. Ugh. That is close but not quite correct. A better explanation: The pivot_root system call will never propagate any changes it makes. The pivot_root system call ensures this is safe by verifying that none of put_old, the parent of new_root, and parent of the root directory have a propagation type of MS_SHARED. > The concern from our conversation at the container mini-summit was that there is a pathology if in your initial mount namespace all of the mounts are marked MS_SHARED like systemd does (and is almost necessary if you are going to use mount propagation), that if new_root itself is MS_SHARED then unmounting the old_root could propagate. So I believe the desired sequence is: >>> chdir(new_root); +++ mount("", ".", MS_SLAVE | MS_REC, NULL); >>> pivot_root(".", "."); >>> umount2(".", MNT_DETACH); The change to new new_root could be either MS_SLAVE or MS_PRIVATE. So long as it is not MS_SHARED the mount won't propagate back to the parent mount namespace. Eric