Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.

Eric W. Biederman wrote:
Daniel Lezcano <daniel.lezcano@xxxxxxx> writes:

Eric W. Biederman wrote:
Introduce two new system calls:
int nsfd(pid_t pid, unsigned long nstype);
int setns(unsigned long nstype, int fd);

These two new system calls address three specific problems that can
make namespaces hard to work with.
- Namespaces require a dedicated process to pin them in memory.
- It is not possible to use a namespace unless you are the
  child of the original creator.
- Namespaces don't have names that userspace can use to talk
  about them.

The nsfd() system call returns a file descriptor that can
be used to talk about a specific namespace, and to keep
the specified namespace alive.

The fd returned by nsfd() can be bind mounted as:
mount --bind /proc/self/fd/N /some/filesystem/path
to keep the namespace alive indefinitely as long as
it is mounted.
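
A minimal C sketch of the bind mount step above, for readers who would rather do it from a program than from the shell. The helper names fd_proc_path() and pin_namespace_fd() are hypothetical and not part of the patch; mount(2) with MS_BIND is the usual programmatic equivalent of "mount --bind". The call needs the privileges mount normally requires, and the target path must already exist.

```c
#include <stdio.h>
#include <sys/mount.h>

/* Hypothetical helper: build "/proc/self/fd/N" for a descriptor.
 * Returns 0 on success, -1 if the buffer is too small. */
int fd_proc_path(int fd, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/proc/self/fd/%d", fd);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: bind mount the descriptor onto `target`, which is
 * what "mount --bind /proc/self/fd/N target" does from the shell.
 * Returns 0 on success, -1 with errno set on failure. */
int pin_namespace_fd(int fd, const char *target)
{
    char src[64];

    if (fd_proc_path(fd, src, sizeof src) < 0)
        return -1;
    return mount(src, target, NULL, MS_BIND, NULL);
}
```

Once the mount is in place, the descriptor itself can presumably be closed, since the description above says the mount keeps the namespace alive.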

open works on the fd returned by nsfd() so another
process can get a hold of it and do interesting things.

Overall that allows for persistent naming of namespaces
according to userspace policy.

setns() allows changing the namespace of the current process
to a namespace that originates with nsfd().
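
The intended call sequence might look roughly like the sketch below. Both syscalls are only proposed here, so proposed_nsfd() and proposed_setns() are placeholder stubs that fail with ENOSYS; they only mirror the prototypes given at the top. The NSTYPE_NET_SKETCH value and the join_namespace() helper are assumptions made for illustration and do not come from the patch.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed value: the description does not list the nstype constants. */
enum { NSTYPE_NET_SKETCH = 1 };

/* Placeholder for the proposed nsfd(): always fails with ENOSYS,
 * because the syscall exists only in this RFC. */
static int proposed_nsfd(pid_t pid, unsigned long nstype)
{
    (void)pid;
    (void)nstype;
    errno = ENOSYS;
    return -1;
}

/* Placeholder for the proposed setns(), same caveat as above. */
static int proposed_setns(unsigned long nstype, int fd)
{
    (void)nstype;
    (void)fd;
    errno = ENOSYS;
    return -1;
}

/* Hypothetical helper: move the calling process into one namespace of
 * `pid`. Returns 0 on success, -1 with errno set on failure. */
int join_namespace(pid_t pid, unsigned long nstype)
{
    int fd = proposed_nsfd(pid, nstype);   /* reference to the namespace */
    if (fd < 0)
        return -1;

    int ret = proposed_setns(nstype, fd);  /* switch the current process */
    int saved = errno;

    close(fd);      /* assumed: the fd is not needed after the switch */
    errno = saved;  /* report the setns error, not a close() error */
    return ret;
}
```

With the stubs in place, join_namespace(1234, NSTYPE_NET_SKETCH) returns -1 with errno set to ENOSYS; on a kernel carrying the patch the stubs would be replaced by the real syscalls.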

Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
---
Is the plan to support all the namespaces with nsfd()?
I mean, will it be possible to pass an OR'ed combination of nstype values to grab
references to several namespaces of the target process at once?

For example: nsfd(1234, NSTYPE_NET | NSTYPE_IPC | NSTYPE_MNT)

No, the plan is only one namespace at a time.

It would not be much of a change to support multiple namespaces,
but I don't think I want to go there.  Bitmaps filling up are
ugly and I don't see what would be gained.

The idea I had in mind when I asked this question was whether we could "move" a process into a container, i.e. a set of namespaces :)

It does make sense to support all of the namespaces we can support
with unshare, but with nstype as an enumeration, not as a bitmap.
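
To make the enumeration-versus-bitmap distinction concrete, here is a small C sketch. All names and values (NS_ENUM_*, NS_BIT_*, and the validation helpers) are hypothetical and chosen only for illustration; the patch does not define them here.

```c
#include <stdbool.h>

/* Enumeration style: each value names exactly one namespace type. */
enum nstype_enum {
    NS_ENUM_NET = 1,
    NS_ENUM_IPC,
    NS_ENUM_MNT,
    NS_ENUM_UTS,
    NS_ENUM_MAX   /* one past the last valid value */
};

/* An enumerated nstype is valid if it falls in the known range. */
bool nstype_enum_valid(unsigned long nstype)
{
    return nstype >= NS_ENUM_NET && nstype < NS_ENUM_MAX;
}

/* Bitmap style: one bit per namespace type, so values can be OR'ed. */
#define NS_BIT_NET (1UL << 0)
#define NS_BIT_IPC (1UL << 1)
#define NS_BIT_MNT (1UL << 2)
#define NS_BIT_ALL (NS_BIT_NET | NS_BIT_IPC | NS_BIT_MNT)

/* A bitmap is valid if it is non-empty and uses only known bits. */
bool nstype_bitmap_valid(unsigned long mask)
{
    return mask != 0 && (mask & ~NS_BIT_ALL) == 0;
}

/* True only when exactly one known bit is set, which is the check a
 * one-namespace-at-a-time interface would need with a bitmap. */
bool nstype_bitmap_is_single(unsigned long mask)
{
    return nstype_bitmap_valid(mask) && (mask & (mask - 1)) == 0;
}
```

With an enumeration, "one namespace per call" holds by construction and new types only need a new value. With a bitmap, the interface has to reject or handle combinations explicitly, and the number of types is bounded by the width of the word.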

I suppose that when you say "to support all of the namespaces we can support with *unshare*", you exclude the pid namespace, which can only be created with clone, right? Do you think we can extend the concept to all the namespaces, including the pid_namespace?

This is slightly better than the earlier version that used a netlink
socket as the reference, because I can give it the semantics of a deleted
file and drop the reference on the namespace only when that file goes
away.  It is also better in that this interface can support all
of the namespaces without adding yet another syscall.
I like the idea :)

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
