Ram Pai <linuxram@xxxxxxxxxx> writes: >> @@ -359,12 +373,24 @@ int propagate_mount_busy(struct mount *mnt, int refcnt) >> >> for (m = propagation_next(parent, parent); m; >> m = propagation_next(m, parent)) { >> - child = __lookup_mnt_last(&m->mnt, mnt->mnt_mountpoint); >> - if (child && list_empty(&child->mnt_mounts) && >> - (ret = do_refcount_check(child, 1))) >> - break; >> + int count = 1; >> + child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint); >> + if (!child) >> + continue; >> + >> + /* Is there exactly one mount on the child that covers >> + * it completely whose reference should be ignored? >> + */ >> + topper = find_topper(child); > > This is tricky. I understand it is trying to identify the case where a > mount got tucked-in because of propagation. But this will not > distinguish the case where a mount got over-mounted genuinely, not because of > propagation, but because of explicit user action. > > > example: > > case 1: (explicit user action) > B is a slave of A > mount something on A/a , it will propagate to B/a > and than mount something on B/a > > case 2: (tucked mount) > B is a slave of A > mount something on B/a > and than mount something on A/a > > Both case 1 and case 2 lead to the same mount configuration. > > > however 'umount A/a' in case 1 should fail. > and 'umount A/a' in case 2 should pass. > > Right? in other words, umounts of 'tucked mounts' should pass(case 2). > whereas umounts of mounts on which overmounts exist should > fail.(case 1) Looking at your example. I agree that case 1 will fail today. However my actual expectation would be for both mount configurations to behave the same. In both cases something has been explicitly mounted on B/a and something has propagated to B/a. In both cases the mount on top is what was explicitly mounted, and the mount below is what was propagated to B/a. I don't see why the order of operations should matter. > maybe we need a flag to identify tucked mounts? To preserve our exact current semantics yes. The mount configurations that are delibearately constructed that I am aware of are comparatively simple. I don't think anyone has even taken advantage of the shadow/side mounts at this point. I made a reasonable effort to find out and no one was even aware they existed. Much less what they were. And certainly no one I talked to could find code that used them. So I think we are fine with a very modest semantic change here. Especially one that appears to make the semantics more consistent and predictable. I also expect the checkpoint/restart folks will appreciate the change as by giving them options it will make it easier to reconstruct complicated mount trees. Eric >> + >> + if (do_refcount_check(child, count)) >> + return 1; >> } >> - return ret; >> + return 0; >> } >> >> /* >> @@ -381,7 +407,7 @@ void propagate_mount_unlock(struct mount *mnt) >> >> for (m = propagation_next(parent, parent); m; >> m = propagation_next(m, parent)) { >> - child = __lookup_mnt_last(&m->mnt, mnt->mnt_mountpoint); >> + child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint); >> if (child) >> child->mnt.mnt_flags &= ~MNT_LOCKED; >> } >> @@ -399,9 +425,11 @@ static void mark_umount_candidates(struct mount *mnt) >> >> for (m = propagation_next(parent, parent); m; >> m = propagation_next(m, parent)) { >> - struct mount *child = __lookup_mnt_last(&m->mnt, >> + struct mount *child = __lookup_mnt(&m->mnt, >> mnt->mnt_mountpoint); >> - if (child && (!IS_MNT_LOCKED(child) || IS_MNT_MARKED(m))) { >> + if (!child || (child->mnt.mnt_flags & MNT_UMOUNT)) >> + continue; >> + if (!IS_MNT_LOCKED(child) || IS_MNT_MARKED(m)) { >> SET_MNT_MARK(child); >> } >> } >> @@ -420,8 +448,8 @@ static void __propagate_umount(struct mount *mnt) >> >> for (m = propagation_next(parent, parent); m; >> m = propagation_next(m, parent)) { >> - >> - struct mount *child = __lookup_mnt_last(&m->mnt, >> + struct mount *topper; >> + struct mount *child = __lookup_mnt(&m->mnt, >> mnt->mnt_mountpoint); >> /* >> * umount the child only if the child has no children >> @@ -430,6 +458,15 @@ static void __propagate_umount(struct mount *mnt) >> if (!child || !IS_MNT_MARKED(child)) >> continue; >> CLEAR_MNT_MARK(child); >> + >> + /* If there is exactly one mount covering all of child >> + * replace child with that mount. >> + */ >> + topper = find_topper(child); >> + if (topper) >> + mnt_change_mountpoint(child->mnt_parent, child->mnt_mp, >> + topper); >> + >> if (list_empty(&child->mnt_mounts)) { >> list_del_init(&child->mnt_child); >> child->mnt.mnt_flags |= MNT_UMOUNT; >> diff --git a/fs/pnode.h b/fs/pnode.h >> index 550f5a8b4fcf..dc87e65becd2 100644 >> --- a/fs/pnode.h >> +++ b/fs/pnode.h >> @@ -49,6 +49,8 @@ int get_dominating_id(struct mount *mnt, const struct path *root); >> unsigned int mnt_get_count(struct mount *mnt); >> void mnt_set_mountpoint(struct mount *, struct mountpoint *, >> struct mount *); >> +void mnt_change_mountpoint(struct mount *parent, struct mountpoint *mp, >> + struct mount *mnt); >> struct mount *copy_tree(struct mount *, struct dentry *, int); >> bool is_path_reachable(struct mount *, struct dentry *, >> const struct path *root); >> -- >> 2.10.1 -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html