On Thu, Jan 05, 2017 at 10:04:14AM +1300, Eric W. Biederman wrote: > - attach_shadows to be renamed __attach_mnt and the it's shadow > handling to be removed. Er... s/the it's/its/, presumably? Or am I misparsing that? > v2: Updated to mnt_change_mountpoint to not call dput or mntput > and instead to decrement the counts directly. It is guaranteed > that there will be other references when mnt_change_mountpoint is > called so this is safe. > +void mnt_change_mountpoint(struct mount *parent, struct mountpoint *mp, struct mount *mnt) > { Too generic name, IMO, and I really wonder if "mount" (== interpose) and "umount" (== excise?) cases would be better off separately. > + struct mountpoint *old_mp = mnt->mnt_mp; > + struct dentry *old_mountpoint = mnt->mnt_mountpoint; > + struct mount *old_parent = mnt->mnt_parent; > + > + list_del_init(&mnt->mnt_child); > + hlist_del_init(&mnt->mnt_mp_list); > + hlist_del_init_rcu(&mnt->mnt_hash); > + > + attach_mnt(mnt, parent, mp); > + > + put_mountpoint(old_mp); > + * > + * During mounting, another mount will continue to use the old > + * mountpoint and during unmounting, the old mountpoint will > + * continue to exist until namespace_unlock which happens well > + * after mnt_change_mountpoint. > + */ Umm... AFAICS, in the former case "another mount" is simply parent, right? > + spin_lock(&old_mountpoint->d_lock); > + old_mountpoint->d_lockref.count--; > + spin_unlock(&old_mountpoint->d_lock); > + mnt_add_count(old_parent, -1); > + if (child->mnt.mnt_root == smp->m_dentry) { Explain, please. In which case is that condition _not_ satisfied, and what should happen i > + struct mount *q; > + q = __lookup_mnt(&child->mnt_parent->mnt, > + child->mnt_mountpoint); > + if (q) > + mnt_change_mountpoint(child, smp, q); > unlock_mount_hash(); > + put_mountpoint(smp); Wrong order... > ns->pending_mounts = 0; > + put_mountpoint(smp); ... and worse yet here. > static inline int do_refcount_check(struct mount *mnt, int count) > { > + struct mount *topper = __lookup_mnt(&mnt->mnt, mnt->mnt.mnt_root); > + if (topper) > + count++; > return mnt_get_count(mnt) > count; > } > @@ -359,7 +362,7 @@ int propagate_mount_busy(struct mount *mnt, int refcnt) > > for (m = propagation_next(parent, parent); m; > m = propagation_next(m, parent)) { > - child = __lookup_mnt_last(&m->mnt, mnt->mnt_mountpoint); > + child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint); > if (child && list_empty(&child->mnt_mounts) && > (ret = do_refcount_check(child, 1))) > break; Er... You do realize that you can end up with more that one such propagation, right? IOW, there might be more than one thing slipped in. > + if (!IS_MNT_LOCKED(child) || IS_MNT_MARKED(m)) { > SET_MNT_MARK(child); Reread the condition, please... And yes, I realize that original is also rather odd; at a guess it was meant to be !(, not (!, but that's just a guess - it's your code, IIRC. > @@ -420,8 +425,8 @@ static void __propagate_umount(struct mount *mnt) > > for (m = propagation_next(parent, parent); m; > m = propagation_next(m, parent)) { > - > - struct mount *child = __lookup_mnt_last(&m->mnt, > + struct mount *topper; > + struct mount *child = __lookup_mnt(&m->mnt, > mnt->mnt_mountpoint); > /* > * umount the child only if the child has no children > @@ -430,6 +435,16 @@ static void __propagate_umount(struct mount *mnt) > if (!child || !IS_MNT_MARKED(child)) > continue; > CLEAR_MNT_MARK(child); > + > + /* If there is exactly one mount covering all of child > + * replace child with that mount. > + */ > + topper = __lookup_mnt(&child->mnt, child->mnt.mnt_root); > + if (topper && > + (child->mnt_mounts.next == &topper->mnt_child) && > + (topper->mnt_child.next == &child->mnt_mounts)) Weird way to spell child->mnt_mounts.next == child->mnt_mounts.prev, that... Or, perhaps, the entire thing ought to be if (list_is_singular(&child->mnt_mounts)) { topper = list_first_entry(&child->mnt_mounts, struct mount, mnt_child); if (topper->mnt_parent == child && topped->mnt_mountpoint == child->mnt.mnt_root) to avoid hash lookups. > + mnt_change_mountpoint(child->mnt_parent, child->mnt_mp, topper); FWIW, the my main worry here is your handling of the umount. For example, what happens if * something is mounted on A (m1) * something else is mounted on A/bar (m2) * D is a slave of C * something has been mounted on D/foo (n) * you do mount --rbind A C/foo (m1' on C/foo, m2' on m1/bar, m1'' interposed on D/foo under n, m2'' on m1''/bar, m1'' slave of m1', m2'' slave of m2) * you make C/foo and C/foo/bar private (m1'' and m2'' are not getting propagation from m1' and m2' anymore) * you umount C/foo/bar (m2' is unmounted) * you umount C/foo m1' gets unmounted, all right, but what of m1''? D is a slave of C, so we get propagation of umount from C/foo to D/foo; m1'' still has m2'' attached to it. AFAICS, your logics will happily slip m1'' from under n (assuming that n itself is not busy), and leak both m1'' and m2''. OTOH, the case of multiple slip-under (Z is slave of Y, which is a slave of X, mount on Z, then mount of Y, then mount on X) the check for being busy would do very odd things. Something's fishy on the umount side... -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html