On April 3, 2015 1:20:35 AM CDT, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: >On Thu, Apr 02, 2015 at 08:56:23PM -0500, Eric W. Biederman wrote: >> +static void mark_violated_mounts(struct dentry *dentry, struct >dentry *target) >> +{ >> + /* Mark all mountroots that are ancestors of dentry >> + * that do not share a common ancestor with target >> + * >> + * This function assumes both dentries are part of a DAG. >> + */ >> + struct dentry *p; >> + >> + for (p = dentry->d_parent; !IS_ROOT(p); p = p->d_parent) { >> + if (!d_mountroot(p)) >> + continue; >> + >> + if (d_ancestor(p, target)) >> + break; > >Egads... You do realize that you'll keep walking the path from target >to >root again and again? It's a tree and we have already walked up to the >root. >So we know the depths and tree topology can't change due to the >rename_lock >being held by caller. In the common case when a mount is not violated we will either not encounter a mount root. Or the mount root will be a common ancestor and so we will break out of the loop. If we can make the pathological cases perform better I am all for it. But I did not see anything immediately obvious. >> + if (!IS_ROOT(dentry) && !IS_ROOT(target)) { >> + mark_violated_mounts(dentry, target); >> + mark_violated_mounts(target, dentry); >> + } >> + >> dentry_lock_for_move(dentry, target); > >> +void mnt_set_violated(struct dentry *root, struct dentry *moving) >> +{ >> + struct mountroot *mr; >> + struct mount *mnt; >> + >> + lock_mount_hash(); >> + mr = lookup_mountroot(root); >> + if (!mr) >> + goto out; >> + >> + hlist_for_each_entry(mnt, &mr->r_list, mnt_mr_list) { >> + struct mount *child; >> + /* Be wary of this mount */ >> + mnt->mnt.mnt_flags |= MNT_VIOLATED; >> + >> + list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) { >> + /* Ignore children that will continue to be connected */ >> + if ((child->mnt_mountpoint != moving) && >> + !d_ancestor(moving, child->mnt_mountpoint)) >> + continue; >> + >> + /* Deal with mounts loosing the connection to >> + * their parents >> + */ >> + if (!(child->mnt.mnt_flags & MNT_UMOUNT)) { >> + child->mnt.mnt_flags |= MNT_UNREACHABLE_PARENT; >> + hlist_add_head(&child->mnt_pending_umount, &pending_umount); >> + schedule_work(&pending_umount_work); >> + } else { >> + umount_mnt(child); >> + } >> + } >> + } >> +out: >> + unlock_mount_hash(); > >And that can have non-trivial security implications - ability to expose >something that has been overmounted is potentially very nasty. I might >be missing something in the rest of the series (I'm half-asleep right >now, >so that's certainly possible), but that doesn't look obviously safe. >Note that it's very different from the situation with >umount-on-invalidation - >there the thing we are uncovering is dead. In this one it might be >very >much alive and deliberately covered... You are correct. I overlooked that corner case. It is hard to reach but not impossible. It is not fundamental to the patchset that the unmount happen. So just taking out the unmount out should work. I am wondering if we could performance some kind of weird half unmount like I do for submounts of what I am unmounting, but I do not think so. I will have to take a look when I get back from the long weekend. At this point I expect the patches are close enough it should not be hard to iterate to a workable fix. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html