Re: [PATCH 1/2] setns.2: Initial man page [RESEND]

Michael Kerrisk <mtk.manpages@xxxxxxxxx> writes:

> Hi Eric,
>
> I'm still wanting your input on the edited setns.2 draft below. Please
> don't make me chase you round Prague ;-).

That could be interesting, as I don't have plans to head out that way
this year.  I got sidetracked by some unexpected computer troubles
that showed up right after I got home.

So overall it looks good.  I found two nits to pick (see below).

The significant nit is how we say that unshare and setns apply
to just a single Linux task and not to the entire process.

When you are writing multi-threaded apps it actually matters.

In particular I keep expecting someone will need a call like:

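#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <sys/socket.h>
#include <unistd.h>

int socketat(int namespace, int domain, int type, int protocol)
{
        int netns, fd;

        /* Remember the current network namespace so we can switch back. */
        netns = open("/proc/self/ns/net", O_RDONLY);
        if (netns < 0)
                return -1;
        if (setns(namespace, CLONE_NEWNET) < 0) {
                close(netns);
                return -1;
        }
        fd = socket(domain, type, protocol);
        /* Switch back to the original namespace and drop our reference. */
        setns(netns, CLONE_NEWNET);
        close(netns);
        return fd;
}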

With a little bit of care (blocking signals around the switch, etc.),
that call can actually be made thread safe.

However, if setns affected all threads of a multi-threaded process,
socketat could not be written in user space; a new system call would
have to be written to do the same job.

Multi-threaded processes that simultaneously deal with multiple
namespaces are likely to be rare but I expect there to be a few
that actually care.
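
For illustration only, the signal blocking I have in mind might look
roughly like the sketch below.  The name socketat_blocked() is made up
for this mail, and the error handling is my assumption of what a caller
would want; it is not a proposed interface.  The point is just that all
signals are blocked for the calling thread while it sits in the foreign
namespace, so a handler never runs there.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical variant of socketat(): same idea, with signals blocked
 * for the calling thread while it is switched to the other namespace. */
int socketat_blocked(int nsfd, int domain, int type, int protocol)
{
        sigset_t all, old;
        int netns, fd = -1, saved_errno;

        /* Block every signal for this thread only; remember the old mask. */
        sigfillset(&all);
        if (pthread_sigmask(SIG_BLOCK, &all, &old) != 0)
                return -1;

        /* Same path as the plain socketat() example; this assumes the
         * calling thread is in the namespace that /proc/self refers to. */
        netns = open("/proc/self/ns/net", O_RDONLY);
        if (netns >= 0 && setns(nsfd, CLONE_NEWNET) == 0) {
                fd = socket(domain, type, protocol);
                saved_errno = errno;
                /* Switch back; a failure here is ignored in this sketch. */
                setns(netns, CLONE_NEWNET);
        } else {
                saved_errno = errno;
        }
        if (netns >= 0)
                close(netns);

        /* Restore the caller's signal mask and the errno of the first failure. */
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        errno = saved_errno;
        return fd;
}
```

A real version would also have to decide what to do when switching back
fails, since the thread is then left in the wrong namespace.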

Eric


> Cheers,
>
> Michael
>
> From: Michael Kerrisk <mtk.manpages@xxxxxxxxx>
> Date: Thu, Sep 15, 2011 at 6:13 AM
> Subject: Re: [PATCH 1/2] setns.2: Initial man page
> To: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> Cc: linux-man@xxxxxxxxxxxxxxx, "Serge E. Hallyn" <serge.hallyn@xxxxxxxxxxxxx>
>
>
> Hello Eric,
>
> See below.
>
> On Mon, May 30, 2011 at 5:16 AM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>>
>> Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
>> ---
>>  man2/setns.2 |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 88 insertions(+), 0 deletions(-)
>>  create mode 100644 man2/setns.2
>>
>> diff --git a/man2/setns.2 b/man2/setns.2
>> new file mode 100644
>> index 0000000..8b48e14
>> --- /dev/null
>> +++ b/man2/setns.2
>> @@ -0,0 +1,88 @@
>> +.\" Copyright (C) 2011, Eric Biederman <ebiederm@xxxxxxxxxxxx>
>> +.\" Licensed under the GPLv2
>> +.\"
>> +.TH SETNS 2 2011-05-28 "Linux" "Linux Programmer's Manual"
>> +.SH NAME
>> +setns \- reassociate parts of the process execution context
>> +.SH SYNOPSIS
>> +.nf
>> +.BR "#define _GNU_SOURCE" "             /* See feature_test_macros(7) */"
>> +.B #include <sched.h>
>> +.sp
>> +.BI "int setns(int " fd ", int " nstype );
>> +.fi
>> +.SH DESCRIPTION
>> +Given a file descriptor referring to a namespace reassociate the
>> +current process with that namespace.
>> +
>> +The
>> +.I nstype
>> +argument is an enumeration that specifies which type of namespace
>> +the current process may be reassociated with.  This argument can
>> +have one of the following values:
>> +
>> +.TP
>> +.BR 0
>> +Allow any namespace to be joined.
>> +.TP
>> +.BR CLONE_NEWIPC
>> +Only allow joining an ipc namespace.
>> +.TP
>> +.BR CLONE_NEWNET
>> +Only allow joining a network namespace.
>> +.TP
>> +.BR CLONE_NEWUTS
>> +Only allow joining a uts namespace.
>> +.PP
>> +If
>> +.I flags
>> +is specified as zero, then
>> +.BR setns ()
>> +is a no-op;
>> +no changes are made to the calling process's execution context.
>> +.SH RETURN VALUE
>> +On success, zero returned.
>> +On failure, \-1 is returned and
>> +.I errno
>> +is set to indicate the error.
>> +.SH ERRORS
>> +.TP
>> +.TP
>> +.B EBADF
>> +A bad file descriptor was passed to setns.
>> +
>> +.TP
>> +.B EINVAL
>> +A file descriptor that does not match the specified nstype.
>> +
>> +Attempting to change the mount namespace and the filesystem
>> +is shared between multiple tasks.
>> +
>> +.TP
>> +.B ENOMEM
>> +Cannot allocate sufficient memory to change the specified namespace.
>> +
>> +.TP
>> +.B EPERM
>> +The calling process did not have the required privileges for this operation.
>> +.SH VERSIONS
>> +The
>> +.BR setns ()
>> +system call first appeared in Linux in kernel 3.0
>> +.SH CONFORMING TO
>> +The
>> +.BR setns ()
>> +system call is Linux-specific.
>> +.SH NOTES
>> +Not all of the process attributes that can be shared when
>> +a new process is created using
>> +.BR clone (2)
>> +can be changed using
>> +.BR setns ().
>> +.SH BUGS
>> +The pid namespace and the mount namespace are not currently supported.
>> +.SH SEE ALSO
>> +.BR clone (2),
>> +.BR fork (2),
>> +.BR vfork (2),
>> +.BR setns(2)
>> --
>> 1.7.5.1.217.g4e3aa
>
> I made various edits to the page, some after our F2F conversations.
> Could you please comment on the new version below?
>
> Note: we talked a couple of times about this piece of text under the
> EINVAL error.
>
>       Attempted  to  change  the  mount  namespace, but the filesystem
>       is shared between multiple tasks.
>
> As I understand it, this refers to interactions between the mount
> namespace and file system namespace. However, as noted in the man
> page, setns() does not support CLONE_NEWNS. Furthermore, I can see no
> path in setns() that generates EINVAL and involves CLONE_NEWNS.
> So, I removed that text. Please let me know if that's wrong.

Removing that text is fine for now.  I expect I will have to re-add it
after I get my next round of patches in, but there is no need to document
what does not yet exist in mainline.


Reading the 

> .\" Copyright (C) 2011, Eric Biederman <ebiederm@xxxxxxxxxxxx>
> .\" Licensed under the GPLv2
> .\"
> .TH SETNS 2 2011-09-15 "Linux" "Linux Programmer's Manual"
> .SH NAME
> setns \- reassociate process with a namespace
> .SH SYNOPSIS
> .nf
> .BR "#define _GNU_SOURCE" "             /* See feature_test_macros(7) */"
> .B #include <sched.h>
> .sp
> .BI "int setns(int " fd ", int " nstype );
> .fi
> .SH DESCRIPTION
> Given a file descriptor referring to a namespace,
> reassociate the calling process with that namespace.
>
> The
> .I fd
> argument is a file descriptor referring to one of the namespace entries in a
> .I /proc/[pid]/ns/
> directory; see
> .BR proc (5)
> for further information on
> .IR /proc/[pid]/ns/ .
> The calling process will be reassociated with the corresponding namespace,
> subject to any constraints imposed by the
> .I nstype
> argument.
>

There is a weird twist that I think makes sense to document.  The unit of
reassociation is a Linux task, which is what is normally seen as a thread.

That is important to consider if you happen to be using this in a
multi-threaded program.  But I'm not certain how best to say that.

Perhaps just say "Linux task" instead of "process"?
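
To make the per-task behaviour concrete, here is a small sketch using
unshare(CLONE_NEWUTS) in a worker thread.  The helper names are made up
for this mail, and it assumes a kernel where the inode number of
/proc/self/task/<tid>/ns/uts identifies the namespace, plus
CAP_SYS_ADMIN for the unshare.  Without the privilege it just reports
that it could not test anything.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Inode of the calling task's UTS namespace, or 0 on error. */
static ino_t my_uts_ns(void)
{
        char path[64];
        struct stat st;

        snprintf(path, sizeof(path), "/proc/self/task/%ld/ns/uts",
                 (long)syscall(SYS_gettid));
        return stat(path, &st) == 0 ? st.st_ino : 0;
}

/* Worker: move only this task into a new UTS namespace and record it. */
static void *worker(void *arg)
{
        ino_t *result = arg;

        if (unshare(CLONE_NEWUTS) == 0)
                *result = my_uts_ns();
        return NULL;
}

/* Hypothetical check: returns 1 if only the worker changed namespace,
 * 0 if the main thread changed too, -1 if it could not be tested
 * (for example, unshare failed with EPERM). */
int unshare_is_per_task(void)
{
        pthread_t t;
        ino_t before = my_uts_ns(), worker_ns = 0, after;

        if (before == 0 || pthread_create(&t, NULL, worker, &worker_ns) != 0)
                return -1;
        pthread_join(t, NULL);
        after = my_uts_ns();
        if (worker_ns == 0 || after == 0)
                return -1;
        return (worker_ns != before && after == before) ? 1 : 0;
}
```

The same reasoning applies to setns: only the task that makes the call
moves, and the other threads stay where they were.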


> .TP
> .BR 0
> Allow any type of namespace to be joined.
> .TP
> .BR CLONE_NEWIPC
> .I fd
> must refer to an IPC namespace.
> .TP
> .BR CLONE_NEWNET
> .I fd
> must refer to a network namespace.
> .TP
> .BR CLONE_NEWUTS
> .I fd
> must refer to a UTS namespace.
> .PP
> Specifying
> .I nstype
> as 0 suffices if the caller knows (or does not care)
> what type of namespace is referred to by
> .IR fd .
> Specifying a nonzero value for
> .I nstype
> is useful if the caller does not know what type of namespace is referred to by
> .IR fd
> and wants to ensure that the namespace is of a particular type.
> (The caller might not know the type of the namespace referred to by
> .IR fd
> if the file descriptor was opened by another process and, for example,
> passed to the caller via a UNIX domain socket.)
> .SH RETURN VALUE
> On success,
> .BR setns ()
> returns 0.
> On failure, \-1 is returned and
> .I errno
> is set to indicate the error.
> .SH ERRORS
> .TP
> .B EBADF
> .I fd
> is not a valid file descriptor.
> .TP
> .B EINVAL
> .I fd
> refers to a namespace whose type does not match that specified in
> .IR nstype .

Just because we have been going back and forth on this bit, I am inclined
to say:

EINVAL fd refers to a namespace whose type does not match that
specified in nstype, or there is a problem with reassociating
the thread with the specified namespace.

> .TP
> .B ENOMEM
> Cannot allocate sufficient memory to change the specified namespace.
> .TP
> .B EPERM
> The calling process did not have the required privilege
> .RB ( CAP_SYS_ADMIN )
> for this operation.
> .SH VERSIONS
> The
> .BR setns ()
> system call first appeared in Linux in kernel 3.0
> .SH CONFORMING TO
> The
> .BR setns ()
> system call is Linux-specific.
> .SH NOTES
> Not all of the process attributes that can be shared when
> a new process is created using
> .BR clone (2)
> can be changed using
> .BR setns ().
> .SH BUGS
> The PID namespace and the mount namespace are not currently supported.
> (See the descriptions of
> .BR CLONE_NEWPID
> and
> .BR CLONE_NEWNS
> in
> .BR clone (2).)
> .SH SEE ALSO
> .BR clone (2),
> .BR fork (2),
> .BR vfork (2),
> .BR proc (5),
> .BR unix (7)