Re: [git pull] mount API series

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Mon, 12 Nov 2018 20:54:39 +0000

On Sun, Nov 11, 2018 at 08:07:20PM -0600, Eric W. Biederman wrote:
> Steven Whitehouse <swhiteho@xxxxxxxxxx> writes:

> > Can you share some details of what this NULL dereference is? David and
> > Al have been working on the changes as requested by Linus later in
> > this thread, and they'd like to tidy up this issue too at the same
> > time if possible. We are not asking you to actually provide a fix, in
> > case you are too busy to do so, however it would be good to know what
> > the issue is so that we can make sure that it is resolved in the next
> > round of patches,
> 
> https://lore.kernel.org/lkml/87bm7n5k1r.fsf@xxxxxxxxxxxx/

Thought it had been dealt with, but you are right - oops is still there
and obviously needs fixing.  However, looking at that place in mainline...
How does that thing manage to work?  Look:
        /* Notice when we are propagating across user namespaces */
        if (m->mnt_ns->user_ns != user_ns)
                type |= CL_UNPRIVILEGED;
        child = copy_tree(last_source, last_source->mnt.mnt_root, type);
        if (IS_ERR(child))
                return PTR_ERR(child);
        child->mnt.mnt_flags &= ~MNT_LOCKED;
        mnt_set_mountpoint(m, mp, child);
        last_dest = m;
        last_source = child;
OK, we'd created the copy to be attached to the next candidate mountpoint.
If we have e.g. a 4-element peer group, we'll start with what we'd been
asked to mount, then call that sucker three times, getting a copy for
each of those mountpoints.  Right?  Now, what happens if the 1st, 3rd and
4th members live in your namespace, with the second one being elsewhere?
We have
	source_mnt: that'll go on top of the 1st mountpoint
	copy of source_mnt: that'll go on top of the 2nd mountpoint
	copy of copy of source_mnt: that'll go on top of the 3rd mountpoint
	copy of copy of copy of source_mnt: that'll go on top of the 4th one
And AFAICS your logics there has just made sure that everything except the
source_mnt will have MNT_LOCK_... all over the place.  IOW, the effect of
CL_UNPRIVELEGED is cumulative.

How the hell does that code avoid this randomness?  Note had the members of
that peer group been in a different order, you would've gotten a different result.
What am I missing here?

Oops is a separate story, and a regression in its own right; it needs to be
fixed.  But I would really like to sort out the semantics of the existing
code in that area, so that we don't end up with patch conflicts.