On Tue, Apr 10, 2012 at 6:01 PM, Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
> Andrew Lutomirski <luto@xxxxxxx> writes:
>
>> On Tue, Apr 10, 2012 at 4:50 PM, Eric W. Biederman
>> <ebiederm@xxxxxxxxxxxx> wrote:
>>> Andrew Lutomirski <luto@xxxxxxx> writes:
>>>
>>>> On Tue, Apr 10, 2012 at 2:59 PM, Eric W. Biederman
>>>> <ebiederm@xxxxxxxxxxxx> wrote:
>>>>> Andy Lutomirski <luto@xxxxxxx> writes:
>>>>>
>>>>> My understanding of no_new_privs is that current_cred() including
>>>>> the user, the user namespace and the security label will never change,
>>>>> with the goal of making the security analysis simple.
>>>>
>>>> They can change but only if you already have the privilege to change
>>>> them yourself and then you do so. For example, PR_SET_NO_NEW_PRIVS,
>>>> setuid, then drop caps is allowed and useful -- it's a race-free way
>>>> to make sure that a given uid never executes without no_new_privs set.
>>>> I've implemented this as a pam module.
>>>
>>> Careful. There is the security_task_fix_setuid call that will raise
>>> your capabilities from cap->effective to cap->permitted if you call
>>> setuid(0). Which in the general case means you can regain all of the
>>> root privileges if you only have CAP_SETUID.
>>>
>>
>> That's fine. If you're running with CAP_SETUID and default
>> securebits, then you effectively have all capabilities already and
>> don't need to exploit setuid binaries to gain them. no_new_privs
>> doesn't change that. If you don't want to be able to gain all privs,
>> change securebits or drop CAP_SETUID. seccomp reduces the kernel
>> attack surface; no_new_privs reduces the userspace attack surface.
>> But see below...
>>
>>
>>>
>>>>> I don't recall how seccomp filters are dealt with if you don't have
>>>>> no_new_privs enabled. If seccomp filters installed by root
>>>>> are dropped when we change privilege levels it might be worth looking
>>>>> at how to keep a seccomp filter installed as long as you stay in
>>>>> a user namespace.
>>>>>
>>>>
>>>> They're not dropped. I think in the current implementation they can't
>>>> be dropped at all.
>>>
>>> Which makes sense. Is this why you need no_new_privs? So you can't run
>>> seccomp on higher privileged executables and confuse them into keeping
>>> privileges when they should not?
>>
>> Exactly. seccomp is flexible enough that it's probably possible to
>> confuse many setuid executables with it.
>>
>>>
>>>>> The emphasis is a bit different from no_new_privs as the user_namespace
>>>>> does not need to guarantee that the lsm will not change security labels,
>>>>> etc.
>>>>
>>>> Hmm. Is this safe? For example, if there's a program that LSM policy
>>>> grants extra privileges that malfunctions when run inside a user
>>>> namespace, can that be used to break out of LSM restrictions?
>>>
>>> I can't see how it would not be safe.
>>>
>>> Except for the user namespace pointer the state the LSM and the rest of
>>> the kernel sees is the same state the kernel sees. Aka userspace sees
>>> uid 0, the LSM does not. So I don't know why a LSM would get confused.
>>>
>>> Beyond that it is a bug for an LSM to grant permissions beyond the
>>> core DAC model. So the worst I can see is an LSM not grokking user
>>> namespaces and getting confused and not restricting a process as
>>> much as the designer of the LSM would like.
>>
>> Right. Suppose you have some program that has extra restrictions
>> applied by an LSM. It executes a helper (e.g. Apache's suidexec
>> thing, but I bet there are more examples) which is supposed to be very
>> careful not to leak privileges. The LSM is set to restrict that
>> helper less than the parent process. But that program was written
>> before user namespaces existed, and it has a bug (or missing feature)
>> that allows its parent to exploit it when run inside an unmapped user
>> namespace. The parent can now escape from the LSM restrictions.
>>
>> no_new_privs is designed to prevent exactly this issue.
>
> Currently the suid exec will fail because the uids don't map.
>
> I might switch that around to simply ignoring the change of uid
> on suid exec. I have a patch in my devel tree that plays with
> that idea. However, as much as I hit that case once in testing
> (I think it was ping), I don't think running suid executables
> is particularly interesting.
>
> Certainly the application program won't care or break, because we are
> still bounded by the usual DAC security.
>
> I wonder a little if the lsm might change labels on exec of a
> non suid binary. That case is more interesting in the unmapped
> unprivileged user namespace.
>
> But I just can't seem to care. The LSM is the line behind which we hide
> the crazy.

Sounds like you're reinventing (something very similar to)
no_new_privs. Why not just require no_new_privs as a prerequisite for
creating a user namespace if you're unprivileged?

--Andy
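
For reference, a minimal sketch of the first step of the
PR_SET_NO_NEW_PRIVS / setuid / drop caps sequence mentioned earlier in
the thread: setting no_new_privs from userspace and reading it back.
It assumes a Linux kernel that carries no_new_privs (3.5 or later in
mainline) and calls prctl(2) through Python's ctypes. The helper names
set_no_new_privs and get_no_new_privs are made up for illustration, and
the numeric constants are copied from <linux/prctl.h>. The setuid and
capability-dropping steps need privileges and are not shown.

```python
import ctypes
import os

# Values from <linux/prctl.h>; assumed here rather than imported.
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39

# Load the C library of the running process and keep errno available.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.prctl.argtypes = [ctypes.c_int] + [ctypes.c_ulong] * 4
_libc.prctl.restype = ctypes.c_int


def set_no_new_privs():
    """Set no_new_privs for the calling thread (hypothetical helper)."""
    # The kernel requires arg2 == 1 and the remaining arguments == 0.
    if _libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def get_no_new_privs():
    """Return 1 if no_new_privs is set for the caller, else 0."""
    value = _libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)
    if value < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return value


if __name__ == "__main__":
    set_no_new_privs()
    print("no_new_privs =", get_no_new_privs())
```

Once set, the bit cannot be cleared and is inherited across fork and
execve, so this only belongs in a process you actually mean to confine.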