Re: [RFC][PATCH 0/8][v2]: Enable multiple mounts of devpts

"Serge E. Hallyn" <serue@xxxxxxxxxx> · Wed, 3 Sep 2008 08:41:59 -0500

Quoting Cedric Le Goater (clg@xxxxxxxxxx):
> Serge E. Hallyn wrote:
> > Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> >> "Serge E. Hallyn" <serue@xxxxxxxxxx> writes:
> >>
> >>>>     (3.2) mnt namespace maybe ?
> >>> I think the last one is the way to go.
> >>>
> >>> mnt_namespace points to mq_ns.
> >>>
> >>> At clone(CLONE_NEWMNT), the new mnt namespace receives a copy of the
> >>> parent's mq_ns.
> >>>
> >>> If a task does
> >>> 	mount -o newinstance -t mqueue none /dev/mqueue
> >>> then its current->nsproxy->mnt_namespace->mq_ns is switched
> >>> to point to a new instance of the mq_ns.
> >>>
> >>> mnt_ns->mq_ns has pointers to the sb (and hence root dentry) of the
> >>> mqueue fs.
> >>>
> >>> When a task does mq_open(name, flag), then name is looked up in the
> >>> mqueuefs found in current->nsproxy->mnt_namespace->mq_ns.
> >>>
> >>> But if a task does
> >>>
> >>> 	clone(CLONE_NEWMNT);
> >>> 	mount --move /dev/mqueue /oldmqueue
> >>> 	mount -o newinstance -t mqueue none /dev/mqueue
> >>>
> >>> then that task can find files for the old mqueuefs under
> >>> /oldmqueue, while mq_open() uses /dev/mqueue since that's
> >>> what it finds through its mnt_namespace.
> >> Serge, if we can make the lookup a pure mount namespace operation,
> >> i.e. a well-known path, then I don't have any problems with it.
> >> Otherwise it looks like abuse of the mount namespace.
> > 
> > Why?
> > 
> > Actually it may work to just put mq_ns straight in the nsproxy.
> 
> ok. that's the path I was taking.
> 
> > So let's see:
> > 
> > 	mq_open(name, flag): opens name under the dentry pointed
> > 		to by current->nsproxy->mq_ns->mq_dentry
> > 	mount -t mqueue none /dev/mqueue: either returns -EBUSY
> > 		or just mounts current->nsproxy->mq_ns->mq_sb
> > 		under /dev/mqueue
> > 	mount -o newinstance -t mqueue none /dev/mqueue: mounts
> > 		 a new mq_ns instance under /dev/mqueue
> > 
> > While doing
> > 	mount --make-rshared /vs1
> > 	mount --bind /dev/mqueue /vs1/dev/mqueue
> > 	create_a_new_container_chrooted_at(/vs1)
> > 		mount -o newinstance -t mqueue none /dev/mqueue
> > would allow the host to see the child's /dev/mqueue under
> > /vs1/dev/mqueue while having its own mqueuefs continue to be
> > mounted under /dev/mqueue.
> 
> ok. complete isolation would require 2 steps. I guess this is
> acceptable because mq uses a fs

What do you mean, it would require 2 steps?  You mean umount followed
by a mount?

Not really, since /dev/mqueue never needed to be bind-mounted under
/vs1/dev/mqueue to begin with.  All the container has to do (while
chrooted under /vs1) is
	mount -o newinstance -t mqueue none /dev/mqueue

IMO two steps means unshare(CLONE_NEWIPC) and mount /dev/mqueue,
which is what the last patchset required.
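
For reference, a rough sketch of that two-step sequence in C.  It assumes
the semantics of the last patchset, i.e. that the mqueue instance follows
the IPC namespace, and that the caller has CAP_SYS_ADMIN.  The helper name
isolate_mqueue() is made up for illustration and is not from any patch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: give the calling task its own mqueue view.
 * Returns 0 on success or a negative errno value on failure.
 */
int isolate_mqueue(const char *target)
{
	/* Step 1: leave the parent's IPC namespace. */
	if (unshare(CLONE_NEWIPC) < 0)
		return -errno;

	/* Step 2: mount the mqueue fs belonging to the new namespace. */
	if (mount("none", target, "mqueue", 0, NULL) < 0)
		return -errno;

	return 0;
}
```

A container init would call isolate_mqueue("/dev/mqueue") after chrooting
under /vs1.  With the newinstance approach, step 1 goes away and only the
mount remains.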

> allowing the host to see the child's /dev/mqueue is also a 'nice
> to have' feature. Unfortunately, we can't do that for all namespaces,
> sysvipc for example. So I'm wondering if we should allow it
> at all?
> 
> Thanks,
> 
> C.
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers