Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes: > On Wed, Jan 11, 2017 at 01:10:57PM +1300, Eric W. Biederman wrote: >> >> + if (child->mnt.mnt_root == smp->m_dentry) { >> > >> > Explain, please. In which case is that condition _not_ satisfied, and >> > what should happen i >> >> When a tree is grafted in that condition does not apply to the lower >> leaves of the tree. At the same time nothing needs to be done for those >> leaves. Only the primary mountpoint needs to worry about tucking. > > How in hell would those lower leaves end up on the list in > attach_recursive_mnt()? IDGI... The submounts of a mount tree that is being attached need to have commit_tree called on them to attach them to a mount namespace. > >> >> + if (!IS_MNT_LOCKED(child) || IS_MNT_MARKED(m)) { >> >> SET_MNT_MARK(child); >> > >> > Reread the condition, please... And yes, I realize that original is >> > also rather odd; at a guess it was meant to be !(, not (!, but that's >> > just a guess - it's your code, IIRC. >> >> The intent is to find all trees that we can unmount where the point >> at which the tree meets the rest of the mounts is not locked. >> >> The mark is later used to see if it ok to unmount a mount or if >> we will reveal information to userspace (by breaking a lock). >> >> Therefore the mark needs to be set if the mount is unlocked, >> and recursively the mark needs to be set for every child of >> that mount where the mark is set (the second condition). > > *blink* > > You have "if mount is not locked - mark it; if mount is already marked - > mark it again". The latter part (|| IS_MNT_MARKED(mnt), that is) looks > very odd, won't you agree? What the hell was that (its counterpart in > the earlier code) about? Not mark it again. If the parent is marked mark the child. This is about finding subtrees where the root of the subree is unlocked but the children may be locked to that root. Still we can safely unmount the entire subtree without revealing anything to userspace. The walk is designed to happen from parent to the child mounts. > I could understand something along the lines "mark it unless it's locked > or already marked", but your code is "mark it if it's not locked *or* > if it's already marked". Makes no sense in that form. > >> > FWIW, the my main worry here is your handling of the umount. For >> > example, what happens if >> > * something is mounted on A (m1) >> > * something else is mounted on A/bar (m2) >> > * D is a slave of C >> > * something has been mounted on D/foo (n) >> > * you do mount --rbind A C/foo (m1' on C/foo, m2' on m1'/bar, >> > m1'' interposed on D/foo under n, >> > m2'' on m1''/bar, >> > m1'' slave of m1', m2'' slave of m2) >> > * you make C/foo and C/foo/bar private (m1'' and m2'' are not getting >> > propagation from m1' and m2' anymore) >> > * you umount C/foo/bar (m2' is unmounted) >> > * you umount C/foo >> > m1' gets unmounted, all right, but what of m1''? D is a slave of C, so we >> > get propagation of umount from C/foo to D/foo; m1'' still has m2'' attached >> > to it. AFAICS, your logics will happily slip m1'' from under n (assuming >> > that n itself is not busy), and leak both m1'' and m2''. >> >> Yes. This is exactly the same behavior we have today without my patch. >> The only difference is who is the parent mount. > > Not quite. In the current tree m1'' should get stuck there (and be exposed > when n gets unmounted); AFAICS, your change will have it kicked out, with > m2'' still attached and still contributing to refcount of m1''. > > I might be missing something (and I hadn't checked your script - right now > I'm at 16 hours of uptime after only 4 hours of sleep). Will take a look > at that after I grab some sleep... Please look with fresh rested eyes. I will see about posting a new version of the patch shortly. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html