On 4/27/24 13:24, Stas Sergeev wrote:
> This flag performs the open operation with the fs credentials
> (fsuid, fsgid, group_info) that were in effect when dir_fd was opened.
> dir_fd must be opened with O_CRED_ALLOW, or EPERM is returned.
>
> Selftests are added to check for these properties as well as for
> the invalid flag combinations.
>
> This allows the process to pre-open some directories and then
> change eUID (and all other UIDs/GIDs) to a less-privileged user,
> retaining the ability to open/create files within these directories.
>
> Design goal:
> The idea is to provide a very light-weight sandboxing, where the
> process, without the use of any heavy-weight techniques like chroot
> within namespaces, can restrict the access to the set of pre-opened
> directories.
> This patch is just a first step to such sandboxing. If things go
> well, in the future the same extension can be added to more syscalls.
> These should include at least unlinkat(), renameat2() and the
> not-yet-upstreamed setxattrat().
>
> Security considerations:
> - Only the bare minimal set of credentials is overridden:
>   fsuid, fsgid and group_info. The rest, for example capabilities,
>   are not overridden to avoid unneeded security risks.
> - To avoid sandboxing escape, this patch makes sure the restricted
>   lookup modes are used. Namely, RESOLVE_BENEATH or RESOLVE_IN_ROOT.
> - Magic /proc symlinks are discarded, as suggested by
>   Andy Lutomirski <luto@xxxxxxxxxx>
> - O_CRED_ALLOW fds cannot be passed via unix socket and are always
>   closed on exec() to prevent "unsuspecting userspace" from not being
>   able to fully drop privs.
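The lookup-mode rule from the security considerations can be sketched in userspace as follows. This only illustrates the check as described above and is not the kernel code from the patch: the helper name cred_inherit_resolve_ok() is hypothetical, OA2_CRED_INHERIT uses the value from the test program in this mail, and the RESOLVE_* values are repeated from the upstream <linux/openat2.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the test program in this mail; not an upstream flag. */
#define OA2_CRED_INHERIT (1UL << 28)

/* Upstream values from <linux/openat2.h>, repeated to keep this standalone. */
#ifndef RESOLVE_BENEATH
#define RESOLVE_BENEATH 0x08
#endif
#ifndef RESOLVE_IN_ROOT
#define RESOLVE_IN_ROOT 0x10
#endif

/*
 * Hypothetical helper: returns true if the flags/resolve pair satisfies the
 * rule described above, i.e. a credential-inheriting open must also request
 * a lookup mode that cannot leave the directory.
 */
static bool cred_inherit_resolve_ok(uint64_t flags, uint64_t resolve)
{
    if (!(flags & OA2_CRED_INHERIT))
        return true;    /* the rule only applies to OA2_CRED_INHERIT opens */
    return (resolve & (RESOLVE_BENEATH | RESOLVE_IN_ROOT)) != 0;
}
```

The test program below passes RESOLVE_BENEATH together with OA2_CRED_INHERIT, so it satisfies this rule.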
What about hard links?
== snip ==
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <stdarg.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/openat2.h>
#define O_CRED_ALLOW 0x2000000
#define OA2_CRED_INHERIT (1UL << 28)
#define SYS_openat2 437
long openat2(int dirfd, const char *pathname, struct open_how *how, size_t size) {
    return syscall(SYS_openat2, dirfd, pathname, how, size);
}

__attribute__ ((noreturn, format(printf, 1, 2)))
static void die(const char *restrict fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    _exit(1);
}

int main() {
    /* Remove leftovers from a previous run. */
    unlink("/tmp/d/test.dat");
    unlink("/tmp/d/hostname");
    if (rmdir("/tmp/d") != 0 && errno != ENOENT)
        die("/tmp/d: %m\n");

    /* As root: create a world-writable directory and pre-open it. */
    umask(0);
    if (mkdir("/tmp/d", 0777) != 0)
        die("/tmp/d: %m\n");
    int dirfd = open("/tmp/d", O_RDONLY | O_CRED_ALLOW);
    if (dirfd == -1)
        die("/tmp/d: %m\n");

    /* Drop privileges. */
    if (setuid(1000) != 0)
        die("setuid: %m\n");

    /* As uid 1000: hard-link a root-owned file into the directory. */
    if (link("/etc/hostname", "/tmp/d/hostname") == -1)
        die("/etc/hostname: %m\n");

    /* A plain openat() as uid 1000 must not get write access. */
    if (openat(dirfd, "hostname", O_RDWR) != -1)
        die("/tmp/d/hostname could be opened by uid 1000\n");

    /* With OA2_CRED_INHERIT, the open uses the credentials saved in dirfd. */
    struct open_how how = {
        .flags = O_RDWR | OA2_CRED_INHERIT,
        .resolve = RESOLVE_BENEATH,
    };
    if (openat2(dirfd, "hostname", &how, sizeof(how)) == -1)
        die("hostname: %m\n");
    printf("able to open /etc/hostname RDWR \n");
}
== snip ==
buczek@dose:~$ gcc -O0 -Wall -Wextra -Werror -g -o test test.c
buczek@dose:~$ sudo ./test
able to open /etc/hostname RDWR
buczek@dose:~$
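When reproducing this, note that whether the link() call in the test program succeeds for uid 1000 depends on the fs.protected_hardlinks sysctl: with a value of 1 the kernel refuses hard links to files the caller neither owns nor can both read and write. /tmp and /etc also need to be on the same filesystem. A minimal sketch to check the setting first; the helper name is made up for illustration:

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns the value of fs.protected_hardlinks
 * (0 or 1), or -1 if the sysctl file cannot be read or parsed.
 */
static int read_protected_hardlinks(void)
{
    FILE *f = fopen("/proc/sys/fs/protected_hardlinks", "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

If this returns 1, the test program would be expected to stop at the link() step with EPERM instead of reaching openat2().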
--
Donald Buczek
buczek@xxxxxxxxxxxxx
Tel: +49 30 8413 1433