Re: [RFC] [PATCH 0/4] uid_ns: introduction

"Serge E. Hallyn" <serue@xxxxxxxxxx> · Thu, 9 Nov 2006 11:35:49 -0600

Quoting Herbert Poetzl (herbert@xxxxxxxxxxxx):
> On Thu, Nov 09, 2006 at 10:50:09AM -0600, Serge E. Hallyn wrote:
> > Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> > > "Serge E. Hallyn" <serue@xxxxxxxxxx> writes:
> > > 
> > > > So from your pov the same objection would apply to tagging vfsmounts,
> > > > or not?
> > > 
> > > No.  The issue is that the NFS client merges different mounts to the
> > > same nfs server into the same superblock.
> > > 
> > > > What is the scenario where the caching is broken? It can't be
> > > > multiple clients accessing the same NFS export from the same NFS
> > > > service container, since that would just be an erroneous setup,
> > > > right?
> > > 
> > > >
> > > >> > As I recall there are two basic issues.
> > > >> > 
> > > >> > Putting the default on the mount structure instead of the
> > > >> > superblock for filesystems that are not uid namespaces aware
> > > >> > sounded reasonable, and allowed certain classes of sharing
> > > >> > between namespaces where they agreed on a subset of the uids
> > > >> > (especially for read-only data).
> > > >> 
> > > >> yes, that is especially interesting for --bind mounts
> > > >> when you 'know' that you will dedicate a certain 
> > > >> sub-tree to one context/guest
> > > >
> > > > Ok, so you wouldn't object to a patch which tagged vfsmounts?
> > > >
> > > > I guess a NULL vfsmnt->user_ns pointer would mean ignore user_ns and
> > > > only apply uid checks (useful for ro bind mount of /usr into multiple
> > > > containers).
> > > 
> > > Bind mounts are peculiar.  But I think as long as you charge
> > > them to the context in which they happen (don't do the bind
> > > until after you switch the user_ns), you should be fine.
> > 
> > Presumably container setup would be somewhat like system boot - you'd
> > start with a shared / filesystem, unshare user namespace, construct your
> > new /, pivot_root, and unmount /old_root, so you end up with all
> > vfsmounts accessible from the container having the correct user_ns.
> 
> well, once again that is a very narrow view of the

why thanks

> real picture, what about the following cases:
> 
>  - folks who _share_ certain filesystems between different
>    guests (maybe for cooperation or just readonly to save
>    resources)

They can just mount --bind the same tree into multiple containers.
Or, they can use a shared filesystem like the initial /.  (I intend for
vfsmount->mnt_user_ns == NULL to mean ignore user namespace checks.)
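
A rough sketch of that last rule, written as user-space C rather than
kernel code.  The struct and function names (vfsmount_sketch,
task_sketch, mnt_user_ns_allows, may_access_as_owner) are made up for
illustration and do not come from any posted patch; only the
"mnt_user_ns == NULL means skip the namespace check" convention is taken
from the paragraph above, and the pointer-equality test for tagged
mounts is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures discussed in this thread. */
struct user_ns {
    int id;                             /* illustrative identifier only */
};

struct vfsmount_sketch {
    const struct user_ns *mnt_user_ns;  /* NULL: shared mount, no namespace check */
};

struct task_sketch {
    const struct user_ns *user_ns;      /* namespace the task runs in */
    unsigned int uid;
};

/* Returns true when the mount's namespace tag does not rule out access. */
bool mnt_user_ns_allows(const struct vfsmount_sketch *mnt,
                        const struct task_sketch *task)
{
    if (mnt->mnt_user_ns == NULL)
        return true;                    /* untagged: only plain uid checks apply */
    /* Assumption: a tagged mount is usable only from the same namespace. */
    return mnt->mnt_user_ns == task->user_ns;
}

/* Namespace check first, then an ordinary owner-uid comparison. */
bool may_access_as_owner(const struct vfsmount_sketch *mnt,
                         const struct task_sketch *task,
                         unsigned int file_uid)
{
    return mnt_user_ns_allows(mnt, task) && task->uid == file_uid;
}
```

With this shape, a read-only /usr bound into several containers would
carry a NULL tag and be subject to uid checks alone, while a
per-container mount would carry that container's user_ns.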

>  - folks who still want a way to access and/or
>    administrate the guests (without going through
>    ssh or whatever, e.g. for bulk updates)

In addition to the shared mounts, Cedric has a bind_ns which lets you
enter another namespace.  I think he's sent that patch out to the
containers list, but if he hasn't I expect he will be soon.

>  - prestructured setups (like build roots) which require
>    pre configured mounts to work ...

I don't see why having the container setup script set these
up is a restriction here.

-serge