Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxxxxxx> --- This is, I think, available here: https://git.kernel.org/?p=linux/kernel/git/luto/linux.git;a=shortlog;h=refs/heads/security/no_new_privs/docs_v1 IMO this is 3.5 material -- it documents a new feature in 3.5. Documentation/prctl/no_new_privs.txt | 50 ++++++++++++++++++++++++++++++++++ 1 files changed, 50 insertions(+), 0 deletions(-) create mode 100644 Documentation/prctl/no_new_privs.txt diff --git a/Documentation/prctl/no_new_privs.txt b/Documentation/prctl/no_new_privs.txt new file mode 100644 index 0000000..cb705ec --- /dev/null +++ b/Documentation/prctl/no_new_privs.txt @@ -0,0 +1,50 @@ +The execve system call can grant a newly-started program privileges that +its parent did not have. The most obvious examples are setuid/setgid +programs and file capabilities. To prevent the parent program from +gaining these privileges as well, the kernel and user code must be +careful to prevent the parent from doing anything that could subvert the +child. For example: + + - The dynamic loader handles LD_* environment variables differently if + a program is setuid. + + - chroot is disallowed to unprivileged processes, since it would allow + /etc/passwd to be replaced from the point of view of a process that + inherited chroot. + + - The exec code has special handling for ptrace. + +These are all ad-hoc fixes. The no_new_privs bit (since Linux 3.5) is a +new, generic mechanism to make it safe for a process to modify its +execution environment in a manner that persists across execve. Any task +can set no_new_privs. Once the bit is set, it is inherited across fork, +clone, and execve and cannot be unset. With no_new_privs set, execve +promises not to grant the privilege to do anything that could not have +been done without the execve call. For example, the setuid and setgid +bits will no longer change the uid or gid; file capabilities will not +add to the permitted set, and LSMs will not relax constraints after +execve. + +Note that no_new_privs does not prevent privilege changes that do not +involve execve. An appropriately privileged task can still call +setuid(2) and receive SCM_RIGHTS datagrams. + +There are two main use cases for no_new_privs so far: + + - Filters installed for the seccomp mode 2 sandbox persist across + execve and can change the behavior of newly-executed programs. + Unprivileged users are therefore only allowed to install such filters + if no_new_privs is set. + + - By itself, no_new_privs can be used to reduce the attack surface + available to an unprivileged user. If everything running with a + given uid has no_new_privs set, then that uid will be unable to + escalate its privileges by directly attacking setuid, setgid, and + fcap-using binaries; it will need to compromise something without the + no_new_privs bit set first. + +In the future, other potentially dangerous kernel features could become +available to unprivileged tasks if no_new_privs is set. In principle, +several options to unshare(2) and clone(2) would be safe when +no_new_privs is set, and no_new_privs + chroot is considerable less +dangerous than chroot by itself. -- 1.7.7.6 -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html