On Mon, Feb 01, 2021 at 06:47:17PM +0100, Jason A. Donenfeld wrote: > Hi Andy & others, > > I was reversing some NT stuff recently and marveling over how wild and > crazy things are over in Windows-land. A few things related to process > creation caught my interest: > > - It's possible to create a new process with an *arbitrary parent > process*, which means it'll then inherit various things like handles > and security attributes and tokens from that new parent process. > > - It's possible to create a new process with the memory space handle > of a different process. Consider this on Linux, and you have some > abomination like `forkat(int pidfd)`. > > The big question is "why!?" At first I was just amused by its presence > in NT. Everything is an object and you can usually freely mix and > match things, and it's very flexible, which is cool. But this is NT, > not Linux. > > Jann and I were discussing, though, that maybe some variant of these > features might be useful to get rid of setuid executables. Imagine > something like `systemd-sudod`, forked off of PID 1 very early. > Subsequently all new processes on the system run with > PR_SET_NO_NEW_PRIVS or similar policies to prevent non-root->root > transition. Then, if you want to transition, you ask systemd-sudod (or > polkitd, or whatever else you have in mind) to make you a new process, > and it then does the various policy checks, and executes a new process > for you as the parent of the requesting process. > > So how would that work? Well, executing processes with arbitrary > parents would be part of it, as above. But we'd probably want to more > carefully control that new process. Which chroot is it in? How do > cgroups work? And so on. And ultimately this design leads to something > like ZwCreateProcess, where you have several arguments, each to a > handle to some part of the new process state, or null to be inherited > from its parent. > > int execve_parent(int parent_pidfd, int root_dirfd, int cgroup_fd, int > namespace_fd, const char *pathname, char *const argv[], char *const > envp[]); > > One could imagine this growing pretty unwieldy. There's also this > other design aspect of Linux that's worth considering. Namespaces and > other process-inherited resources are generally hierarchical, with > children getting the resource from their parent. This makes sense and > is simple to conceptualize. Everytime we add a new thing_fd as a > pointer to one of these resources, and allow it to be used outside of > that hierarchy, it introduces a kind of "escape hatch". That might be > considered "bad design" by some; it might not be by others. Seen this > way, NT is one massive escape hatch, with pretty much everything being > an object with a handle. > > But! Maybe this is nonetheless an interesting design avenue to > explore. The introduction of pidfd is sort of just the "beginning" of > that kind of design. > > Is any of this interesting to you as a future of privilege escalation > and management on Linux? A bunch of this was discussed in a breakout room during Linux Plumbers last year and I also had discussions with Lennart about this a little while ago. One API I had proposed was to extend pidfd_open() to give you a pidfd that does not yet refer to any process, i.e. instead of int pidfd = pidfd_open(1234, 0); you could do int pidfd = pidfd_open(-1/-ESRCH, 0); which would give you an empty process handle without any mentionable properties. A simple/dumb design would then be to let clone3() not just return pidfds but also take pidfds as an argument. You could then hand-off the pidfd to another process SCM_RIGHTS/pidfd_getfd() and have it create a process for you with the privileges of the caller, you'd still be the parent. Or in addition to pidfd_open() we add new syscalls to configure a process context pidfd_configure() or sm. This design I initially proposed before we ended up with what we have now. So yes, I would love to have at least the concept to create a process for another process, delegated fork, essentially. Christian