Hi Eric,

On Fri, 2011-05-06 at 19:24 -0700, Eric W. Biederman wrote:
> With the networking stack today there is demand to handle
> multiple network stacks at a time.  Not in the context
> of containers but in the context of people doing interesting
> things with routing.
>
> There is also demand in the context of containers to have
> an efficient way to execute some code in the container itself.
> If nothing else it is very useful as a debugging technique.
>
> Both problems can be solved by starting some form of login
> daemon in the namespaces people want access to, or you
> can play games by ptracing a process and getting the
> traced process to do things you want it to do.  However
> it turns out that a login daemon or a ptrace puppet
> controller are more code, they are more prone to
> failure, and generally they are less efficient than
> simply changing the namespace of a process to a
> specified one.
>
> Pieces of this puzzle can also be solved by instead of
> coming up with a general purpose system call coming up
> with targeted system calls perhaps socketat that solve
> a subset of the larger problem.  Overall that appears
> to be more work for less reward.
>
> int setns(int fd, int nstype);
>
> The fd argument is a file descriptor referring to a proc
> file of the namespace you want to switch the process to.
>
> In the setns system call the nstype is 0 or specifies
> a clone flag of the namespace you intend to change
> to prevent changing a namespace unintentionally.

I don't understand exactly what the nstype argument buys us - why would
correct code ever need to specify a value other than 0?  And reusing the
CLONE_NEW* values in this interface is kind of ugly when setns is
precisely _not_ creating new namespaces.

Is there some fundamental reason it couldn't be

int setns(int fd);

or is there a use case I'm missing?
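To check that I'm reading the semantics correctly: as far as I can tell, the nstype test in the patch below reduces to the following userspace model. This is only a sketch. nstype_check() and the NS_* constants are names made up for illustration (the values mirror CLONE_NEWUTS, CLONE_NEWIPC and CLONE_NEWNET), and fd_type stands in for ops->type.

```c
#include <errno.h>

/*
 * Illustrative stand-ins for the clone flags; the values match
 * CLONE_NEWUTS, CLONE_NEWIPC and CLONE_NEWNET.
 */
enum {
	NS_UTS = 0x04000000,
	NS_IPC = 0x08000000,
	NS_NET = 0x40000000,
};

/*
 * Model of "if (nstype && (ops->type != nstype)) goto out;" with
 * err preset to -EINVAL.  fd_type plays the role of ops->type, i.e.
 * the kind of namespace the proc file descriptor refers to.
 */
static int nstype_check(int fd_type, int nstype)
{
	if (nstype && fd_type != nstype)
		return -EINVAL;	/* caller asked for a different namespace type */
	return 0;		/* nstype == 0 accepts any namespace fd */
}
```

So nstype == 0 accepts any namespace fd, and a nonzero nstype can only turn a call that would otherwise succeed into -EINVAL when the fd refers to a different kind of namespace.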

> +SYSCALL_DEFINE2(setns, int, fd, int, nstype)
> +{
> +	const struct proc_ns_operations *ops;
> +	struct task_struct *tsk = current;
> +	struct nsproxy *new_nsproxy;
> +	struct proc_inode *ei;
> +	struct file *file;
> +	int err;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +
> +	file = proc_ns_fget(fd);
> +	if (IS_ERR(file))
> +		return PTR_ERR(file);
> +
> +	err = -EINVAL;
> +	ei = PROC_I(file->f_dentry->d_inode);
> +	ops = ei->ns_ops;
> +	if (nstype && (ops->type != nstype))
> +		goto out;
> +
> +	new_nsproxy = create_new_namespaces(0, tsk, tsk->fs);

create_new_namespaces() can fail; shouldn't this be checked?

> +	err = ops->install(new_nsproxy, ei->ns);
> +	if (err) {
> +		free_nsproxy(new_nsproxy);
> +		goto out;
> +	}
> +	switch_task_namespaces(tsk, new_nsproxy);
> +out:
> +	fput(file);
> +	return err;
> +}
> +
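
For the create_new_namespaces() comment, something along these lines is what I had in mind. It is a userspace model rather than a tested kernel change: the ERR_PTR()/IS_ERR()/PTR_ERR() definitions here are simplified stand-ins for the kernel helpers of the same names, and struct nsproxy, model_create_new_namespaces() and model_setns_tail() are hypothetical names used only to make the sketch compile on its own.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error pointers live in the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical placeholder; the real struct nsproxy looks nothing like this. */
struct nsproxy {
	int installed;
};

/* Stand-in for create_new_namespaces(): returns an error pointer on failure. */
static struct nsproxy *model_create_new_namespaces(int simulate_failure)
{
	struct nsproxy *ns;

	if (simulate_failure)
		return ERR_PTR(-ENOMEM);
	ns = calloc(1, sizeof(*ns));
	return ns ? ns : ERR_PTR(-ENOMEM);
}

/* Model of the tail of setns() with the missing check added. */
static int model_setns_tail(int simulate_failure)
{
	struct nsproxy *new_nsproxy;
	int err;

	new_nsproxy = model_create_new_namespaces(simulate_failure);
	if (IS_ERR(new_nsproxy)) {
		/* Propagate the failure instead of using a bogus pointer. */
		err = PTR_ERR(new_nsproxy);
		goto out;
	}

	new_nsproxy->installed = 1;	/* stands in for install + switch */
	err = 0;
	free(new_nsproxy);		/* model only: nothing keeps a reference */
out:
	/* The real code would fput(file) here. */
	return err;
}
```

In the patch itself the equivalent would be an IS_ERR(new_nsproxy) test right after the create_new_namespaces() call, setting err from PTR_ERR() and jumping to out so the file reference is still dropped.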