On Sat, Feb 04, 2017 at 07:26:20AM +1300, Eric W. Biederman wrote: > Ram Pai <linuxram@xxxxxxxxxx> writes: > > > On Fri, Feb 03, 2017 at 11:54:21PM +1300, Eric W. Biederman wrote: > >> ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes: > >> > >> > Ram Pai <linuxram@xxxxxxxxxx> writes: > >> > > >> >> On Sat, Jan 21, 2017 at 05:15:29PM +1300, Eric W. Biederman wrote: > >> >>> Ram Pai <linuxram@xxxxxxxxxx> writes: > >> >>> > >> >>> >> @@ -359,12 +373,24 @@ int propagate_mount_busy(struct mount *mnt, int refcnt) > >> >>> >> > >> >>> >> for (m = propagation_next(parent, parent); m; > >> >>> >> m = propagation_next(m, parent)) { > >> >>> >> - child = __lookup_mnt_last(&m->mnt, mnt->mnt_mountpoint); > >> >>> >> - if (child && list_empty(&child->mnt_mounts) && > >> >>> >> - (ret = do_refcount_check(child, 1))) > >> >>> >> - break; > >> >>> >> + int count = 1; > >> >>> >> + child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint); > >> >>> >> + if (!child) > >> >>> >> + continue; > >> >>> >> + > >> >>> >> + /* Is there exactly one mount on the child that covers > >> >>> >> + * it completely whose reference should be ignored? > >> >>> >> + */ > >> >>> >> + topper = find_topper(child); > >> >>> > > >> >>> > This is tricky. I understand it is trying to identify the case where a > >> >>> > mount got tucked-in because of propagation. But this will not > >> >>> > distinguish the case where a mount got over-mounted genuinely, not because of > >> >>> > propagation, but because of explicit user action. > >> >>> > > >> >>> > > >> >>> > example: > >> >>> > > >> >>> > case 1: (explicit user action) > >> >>> > B is a slave of A > >> >>> > mount something on A/a , it will propagate to B/a > >> >>> > and than mount something on B/a > >> >>> > > >> >>> > case 2: (tucked mount) > >> >>> > B is a slave of A > >> >>> > mount something on B/a > >> >>> > and than mount something on A/a > >> >>> > > >> >>> > Both case 1 and case 2 lead to the same mount configuration. > >> >>> > > >> >>> > > >> >>> > however 'umount A/a' in case 1 should fail. > >> >>> > and 'umount A/a' in case 2 should pass. > >> >>> > > >> >>> > Right? in other words, umounts of 'tucked mounts' should pass(case 2). > >> >>> > whereas umounts of mounts on which overmounts exist should > >> >>> > fail.(case 1) > >> >>> > >> >>> Looking at your example. I agree that case 1 will fail today. > >> >> > >> >> And should continue to fail. right? Your semantics change will pass it. > >> > > >> > I don't see why it should continue to fail. > >> > > >> >>> However my actual expectation would be for both mount configurations > >> >>> to behave the same. In both cases something has been explicitly mounted > >> >>> on B/a and something has propagated to B/a. In both cases the mount > >> >>> on top is what was explicitly mounted, and the mount below is what was > >> >>> propagated to B/a. > >> >>> > >> >>> I don't see why the order of operations should matter. > >> >> > >> >> One of the subtle expectation is reversibility. > >> >> > >> >> Mount followed immediately by unmount has always passed and that is the > >> >> standard expectation always. Your proposed code will ensure that. > >> >> > >> >> However there is one other subtle expectaton. > >> >> > >> >> A mount cannot disappear if a user has explicitly mounted on top of it. > >> >> > >> >> your proposed code will not meet that expectation. > >> >> > >> >> In other words, these two expectations make it behave differently even > >> >> when; arguably, they feel like the same configuration. > >> > > >> > I am not seeing that. > >> > > >> > > >> > > >> >>> > >> >>> > maybe we need a flag to identify tucked mounts? > >> >>> > >> >>> To preserve our exact current semantics yes. > >> >>> > >> >>> The mount configurations that are delibearately constructed that I am > >> >>> aware of are comparatively simple. I don't think anyone has even taken > >> >>> advantage of the shadow/side mounts at this point. I made a reasonable > >> >>> effort to find out and no one was even aware they existed. Much less > >> >>> what they were. And certainly no one I talked to could find code that > >> >>> used them. > >> >> > >> >> But someday; even if its after a decade, someone ;) will > >> >> stumble into this semantics and wonder 'why?'. Its better to get it right > >> >> sooner. Sorry, I am blaming myself; for keeping some of the problems > >> >> open thinking no one will bump into them. > >> > > >> > Oh definitely. If we have people ready to talk it through I am happy to > >> > dot as many i's and cross as many t's as we productively can. > >> > > >> > I was just pointing out that I don't have any reason to expect that any > >> > one depends on the subtle details of the implementation today so we > >> > still have some wiggle room to fix them. Even if they are visible to > >> > user space. > >> > >> So I haven't seen a reply, and we are getting awfully close to the merge > >> window. Is there anything concrete we can do to ease concerns? > >> > >> Right now I am thinking my last version of the patch is the likely the > >> best we have time and energy to manage and it would be good to merge it > >> before the code bit rots. > > > > I was waiting for some other opinions on the behavior, since I > > continue to think that 'one should not be able to unmount mounts on > > which a user has explicitly mounted upon'. I am happy to be overruled, > > since your patch significantly improves the rest of the semantics. > > > > Viro? > > Ram Pai, just to be clear you were hoping to add the logic below to my patch? Yes. the behavior of your patch below is what I was proposing. > > My objections to the snippet below are: > > - It makes it hard for the CRIU folks (yet more state they have to find > and restore). true. unfortunately one more subtle detail to be aware off. > > - It feels subjectively worse to me. > > - We already have cases where mounts are unmounted transparently (umount on rmdir). sorry. i am not aware of this case. some details will help. > > - Al Viro claims that the side/shadow mounts are ordinary mounts and > maintaining this extra logic that remembers if we tucked one mount > under another seems to make this them less ordinary. I tend to argue that they are a bit more than ordinary, for they have the ability to tuck. > > - The symmetry for unmounting exists for a tucked mount. We can unmount > it via propagation or we can unmount the mount above it, and then we > can unmount the new underlying mount. this is fine with me. > So I don't see why we don't > want symmetry in the other case just because we mounted on top of > the mount and rather than had the mount tucked under us. A tucked mount should be un-tuckable. I agree. But a non-tucked mount cannot pretend to be tucked and this is where I disagree. > > diff --git a/fs/namespace.c b/fs/namespace.c > index 8bfad42c1ccf..8b00e0548438 100644 > --- a/fs/namespace.c > +++ b/fs/namespace.c > @@ -2047,8 +2047,10 @@ static int attach_recursive_mnt(struct mount *source_mnt, > hlist_del_init(&child->mnt_hash); > q = __lookup_mnt(&child->mnt_parent->mnt, > child->mnt_mountpoint); > - if (q) > + if (q) { > mnt_change_mountpoint(child, smp, q); > + child->mnt.mnt_flags |= MNT_TUCKED; > + } > commit_tree(child); > } > put_mountpoint(smp); > diff --git a/fs/pnode.c b/fs/pnode.c > index 5bc7896d122a..e2a6ac68feb9 100644 > --- a/fs/pnode.c > +++ b/fs/pnode.c > @@ -327,6 +327,9 @@ static struct mount *find_topper(struct mount *mnt) > /* If there is exactly one mount covering mnt completely return it. */ > struct mount *child; > > + if (!(mnt->mnt.mnt_flags & MNT_TUCKED)) > + return NULL; > + > if (!list_is_singular(&mnt->mnt_mounts)) > return NULL; > > diff --git a/include/linux/mount.h b/include/linux/mount.h > index 8e0352af06b7..25ca398b19b3 100644 > --- a/include/linux/mount.h > +++ b/include/linux/mount.h > @@ -52,6 +52,7 @@ struct mnt_namespace; > > #define MNT_INTERNAL 0x4000 > > +#define MNT_TUCKED 0x020000 > #define MNT_LOCK_ATIME 0x040000 > #define MNT_LOCK_NOEXEC 0x080000 > #define MNT_LOCK_NOSUID 0x100000 > > Eric -- Ram Pai