Hi Andy,

I have applied your patch (below). Thanks for writing it. But I have
a question or two and a request.

===

In the capabilities(7) page there is the longstanding text:

       An application can use the following call to lock itself, and
       all of its descendants, into an environment where the only way
       of gaining capabilities is by executing a program with
       associated file capabilities:

           prctl(PR_SET_SECUREBITS,
                   SECBIT_KEEP_CAPS_LOCKED |
                   SECBIT_NO_SETUID_FIXUP |
                   SECBIT_NO_SETUID_FIXUP_LOCKED |
                   SECBIT_NOROOT |
                   SECBIT_NOROOT_LOCKED);

As far as I can tell, there is no need to add
SECBIT_NO_CAP_AMBIENT_RAISE and SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED
to the above prctl() call, but could you confirm please?

===

In the message for kernel commit
58319057b7847667f0c9585b9de0e8932b0fdb08 you included this text:

[[
Because capability inheritance is so broken, setting KEEPCAPS, using
setresuid to switch to nonroot uids, and then calling execve
effectively drops capabilities. Therefore, setresuid from root to
nonroot conditionally clears pA unless SECBIT_NO_SETUID_FIXUP is set.
Processes that don't like this can re-add bits to pA afterwards.
]]

I'm struggling to understand the significance of this text, especially
as your man-pages patch makes no mention of this point. The thing is,
that text ("Therefore...") implies that there's something special going
on beyond the rules already documented elsewhere. I mean, according to
the rules already documented elsewhere in the page:

(1) If a process with UIDs of 0 sets all its UIDs nonzero, then
    the permitted and effective sets are cleared (that's the classical
    behavior), and because the permitted set is cleared, then so is
    the ambient set.

(2) And if we set SECBIT_NO_SETUID_FIXUP then a UID 0 ==> nonzero
    transition doesn't clear permitted and effective sets, and then
    of course the ambient set is not cleared.

So, what additional point were you meaning to convey in the commit message?
(Maybe it was just cruft in the commit message, but if not, can you
explain precisely the arguments for setresuid() that are supposed to
generate the special behavior described by the above text.)

===

I did quite a bit of tweaking of the text that you added for the
capabilities page. Could you please check the following to make
sure I added no errors:

       Ambient (since Linux 4.3):
              This is a set of capabilities that are preserved across
              an execve(2) of a program that is not privileged. The
              ambient capability set obeys the invariant that no
              capability can ever be ambient if it is not both
              permitted and inheritable.

              The ambient capability set can be directly modified
              using prctl(2). Ambient capabilities are automatically
              lowered if either of the corresponding permitted or
              inheritable capabilities is lowered.

              Executing a program that changes UID or GID due to the
              set-user-ID or set-group-ID bits or executing a program
              that has any file capabilities set will clear the
              ambient set. Ambient capabilities are added to the
              permitted set and assigned to the effective set when
              execve(2) is called.

Cheers,

Michael


On 11/04/2015 12:42 AM, Andy Lutomirski wrote:
> Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> ---
>
> Changes from v2: Add a note about arg3 == 0 in CLEAR_ALL.
>
> man2/prctl.2 | 13 +++++++++++++
> man7/capabilities.7 | 40 ++++++++++++++++++++++++++++++++++------
> 2 files changed, 47 insertions(+), 6 deletions(-)
>
> diff --git a/man2/prctl.2 b/man2/prctl.2
> index e743a6305969..bf8680f3b62d 100644
> --- a/man2/prctl.2
> +++ b/man2/prctl.2
> @@ -954,6 +954,19 @@ had been called.
> For further information on Intel MPX, see the kernel source file
> .IR Documentation/x86/intel_mpx.txt .
> .\"
> +.TP
> +.BR PR_CAP_AMBIENT " (since Linux 4.2)"
> +Reads or changes the ambient capability set. If arg2 is PR_CAP_AMBIENT_RAISE,
> +then the capability specified in arg3 is added to the ambient set. This will
> +fail, returning EPERM, if the capability is not already both permitted and
> +inheritable or if the SECBIT_NO_CAP_AMBIENT_RAISE securebit is set. If arg2
> +is PR_CAP_AMBIENT_LOWER, then the capability specified in arg3 is removed
> +from the ambient set. If arg2 is PR_CAP_AMBIENT_IS_SET, then
> +.BR prctl (2)
> +will return 1 if the capability in arg3 is in the ambient set and 0 if not.
> +If arg2 is PR_CAP_AMBIENT_CLEAR_ALL, then all capabilities will
> +be removed from the ambient set. (Using PR_CAP_AMBIENT_CLEAR_ALL requires
> +setting arg3 to zero.)
> .SH RETURN VALUE
> On success,
> .BR PR_GET_DUMPABLE ,
> diff --git a/man7/capabilities.7 b/man7/capabilities.7
> index 616189c881e4..8934d05a5b07 100644
> --- a/man7/capabilities.7
> +++ b/man7/capabilities.7
> @@ -700,13 +700,34 @@ a program whose associated file capabilities grant that capability).
> .IR Inheritable :
> This is a set of capabilities preserved across an
> .BR execve (2).
> -It provides a mechanism for a process to assign capabilities
> -to the permitted set of the new program during an
> -.BR execve (2).
> +Inheritable capabilities remain inheritable when executing any program,
> +and inheritable capabilities are added to the permitted set when executing
> +a program that has the corresponding bits set in the file inheritable set.
> +.IP
> +Because inheritable capabilities are not generally preserved across
> +.BR execve (2)
> +when running as a non-root user, applications that wish to run helper
> +programs with elevated capabilities should consider using ambient capabilities,
> +described below.
> .TP
> .IR Effective :
> This is the set of capabilities used by the kernel to
> perform permission checks for the thread.
> +.TP
> +.IR Ambient " (since Linux 4.3):"
> +This is a set of capabilities that are preserved across an
> +.BR execve (2)
> +of a program that does not have file capabilities. The ambient capability
> +set obeys the invariant that no capability can ever be ambient if it is
> +not both permitted and inheritable. Ambient capabilities are
> +preserved in the permitted set and added to the effective
> +set when
> +.BR execve (2)
> +is called. The ambient capability set is modified using
> +.BR prctl (2).
> +Executing a program that changes uid or gid due to the setuid or setgid
> +bits or executing a program that has any file capabilities set will clear
> +the ambient set.
> .PP
> A child created via
> .BR fork (2)
> @@ -788,10 +809,12 @@ the process using the following algorithm:
> .in +4n
> .nf
>
> +P'(ambient) = (file has capabilities or is setuid or setgid) ? 0 : P(ambient)
> +
> P'(permitted) = (P(inheritable) & F(inheritable)) |
> -                (F(permitted) & cap_bset)
> +                (F(permitted) & cap_bset) | P'(ambient)
>
> -P'(effective) = F(effective) ? P'(permitted) : 0
> +P'(effective) = F(effective) ? P'(permitted) : P'(ambient)
>
> P'(inheritable) = P(inheritable) [i.e., unchanged]
>
> @@ -1074,6 +1097,10 @@ an effective or real UID of 0 calls
> .BR execve (2).
> (See the subsection
> .IR "Capabilities and execution of programs by root" .)
> +.TP
> +.B SECBIT_NO_CAP_AMBIENT_RAISE
> +Setting this flag disallows
> +.BR PR_CAP_AMBIENT_RAISE .
> .PP
> Each of the above "base" flags has a companion "locked" flag.
> Setting any of the "locked" flags is irreversible,
> @@ -1082,8 +1109,9 @@ corresponding "base" flag.
> The locked flags are:
> .BR SECBIT_KEEP_CAPS_LOCKED ,
> .BR SECBIT_NO_SETUID_FIXUP_LOCKED ,
> +.BR SECBIT_NOROOT_LOCKED ,
> and
> -.BR SECBIT_NOROOT_LOCKED .
> +.BR SECBIT_NO_CAP_AMBIENT_RAISE .
> .PP
> The
> .I securebits
>

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/