Ram Pai <linuxram@xxxxxxxxxx> writes: > On Thu, Jan 12, 2017 at 05:18:12AM +1300, Eric W. Biederman wrote: >> >> When I look at what propagate_mount_busy is trying to do and I look >> at the code closely I discover there is a great disconnect between the >> two. In the ordinary non-propagation case propagate_mount_busy has >> been verifying that there are no submounts and that there are no >> extraneous references on the mount. >> >> For mounts that the unmount would propagate to propagate_mount_busy has >> been verifying that there are no extraneous references only if there >> are no submounts. Which is nonsense. > > > the reason why we had to do it that way was because there were > situations where it was impossible to umount anything... > > take for example. > > (1) mount --make-shared A > > (2) mount --bind A A/a > > The tree looks like this > > A > | > B > > (3) mount --bind A B/a > The tree looks like this > A > | > B B' (B' becomes a shadow mount) > | > C > > > (4) mount --make-slave A > At this point B and C are peers and A is a slave. > > (5) umount B' > NOTE: This used to be possible a decade ago if the process doing > the umount had access to its dentry. > The tree looks like this > A > | > B > | > C > > Now if you try to unmount C, it becomes impossible, reason being... > > B is the parent of C. > So the umount propagates to A. But A has B mounted at the same > location. But B is busy since it has got a child C. > So the entire umount has to fail. There is no way to umount it all. > Kind of stuck for ever. That is the reason; in those days a decade ago, > we relaxed the rule to let go propagated mounts that had children. > > The above example is a simplest case that demonstrates the phenomenon. > > Given that, the current code does not allow any process to reach shadow > mount B' and given that we are getting rid of shadow mounts, I think we > should allow the code changes you propose in this patch. Thank you very much for the good description of why propagate_mount_busy works the way it does. I just finished taking a hard look at this and in fact the current code does allow reaching B' via umount propagation. My other patch changes exactly how you have to reach it but it is still possible to umount B' At the same time those mounts have alwasy been unmountable with "umount -l" aka MOUNT_DETACH. Have you ever encountered a non-contrived situation that leads to this kind of problem? I expect if we can verify that docker, and systemd are similar pieces of the linux ecosystem are not depending on the exact details of the propagation of the umount busy we should be able to remove this. Last I looked the uses of mount and umount were all quite simple, so I think it is very possible to make this change. Especially as it is now much harder to get into the situation you describe. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html