On Thu, Jul 21, 2011 at 11:21:07AM -0700, Linus Torvalds wrote:
> I think we could have a pretty simple approach that "works in
> practice": retain the check at setuid() time, but make it a higher
> limit.
>
> IOW, the logic is that we have two competing pressures:
>
>  (a) we should try to avoid failing on setuid(), because there is a
> real risk that the setuid caller doesn't really check the failure case
> and opens itself up for a security problem
>
> and
>
>  (b) never failing setuid at all is in itself a security problem,
> since it can lead to DoS attacks in the form of excessive resource use
> by one user.

I don't recall anyone stating (b) the way you did above (or
sufficiently similar).  I wouldn't consider setuid() never failing to
be a security problem.

BTW, some people consider setuid() failing on RLIMIT_NPROC kernel
"brokenness", which applications have to "work around":

http://www.openwall.com/lists/musl/2011/07/21/3

"I'm aware of course that some interfaces *can* fail for nonstandard
reasons under Linux (dup2 and set*uid come to mind), and I've tried to
work around these and shield applications from the brokenness..."

So opinions on setuid() failing vary, whereas (a) is clear - there have
been vulnerabilities caused by setuid() failing.

The DoS that you mention doesn't necessarily have to be dealt with by
setuid() failing on RLIMIT_NPROC (nor on a higher limit).  It could
also be dealt with by checking the limit on execve(), like we've been
doing on shared web hosting servers for years, and, if desired, by
letting applications like Android/Zygote check for the condition
themselves via a new prctl() (or they can simply pass an extra fork(),
although that's a bit costly).

> IOW, I'd suggest simply making the rule be that "setuid() allows 10%
> more users than the limit technically says". It's not a guarantee, but
> it means that in order to hit the problem, you need to have *both* a
> setuid application that allows unconstrained user forking *and*
> doesn't check the setuid() return value.
>
> Put another way: a user cannot force the "we're at the edge of the
> setuid() limit" on its own by just forking - the user will be stopped
> 10% before the setuid() failure case can ever trigger.

For a malicious user, this merely adds the task of triggering a race
condition - have a sufficient number of processes accumulate in the
state between setuid() and execve().  If the program in question can
be made to sleep, this may be trivial to do.  Otherwise, it may require
very rapid requests (automated) and high system load.

(BTW, 10% of 0 would be 0, which would allow for attacks that are as
simple as they are now, but that's an implementation detail.  Of
course, you'd actually add some constant as well.)

> Is this some "guarantee of nothing bad can ever happen"? No. If you
> have bad setuid applications, you will have problems. But it's a "you
> need to really work harder at it and you need to find more things to
> go wrong", which is after all what real security is all about.
>
> No?

I generally support having multiple layers of security even if some
are non-perfect, but in this case we have a problem that we can _fully_
deal with rather than merely make attacks harder.

So my proposal remains to go with Vasiliy's trivial patch and maybe add
a few things on top of it as I mentioned in my previous message.

Thanks,

Alexander
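
For illustration only, here is a minimal userspace C sketch of two
points from the discussion above: the arithmetic of a "limit plus 10%
plus a constant" threshold (including the 10%-of-0 case), and a
privilege drop that treats a failed setuid() as fatal.  The names
setuid_nproc_limit() and drop_privileges_or_die(), and the "extra"
parameter, are hypothetical and chosen for this example.  This is not
kernel code and not Vasiliy's patch.

```c
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: the raised threshold that a setuid()-time check
 * would use under the "10% more" idea, plus a constant "extra".
 * With extra == 0, a limit of 0 stays 0, which is the corner case
 * noted in the message.  Saturates at RLIM_INFINITY on overflow.
 */
rlim_t setuid_nproc_limit(rlim_t soft, rlim_t extra)
{
    rlim_t slack;

    if (soft == RLIM_INFINITY)
        return RLIM_INFINITY;

    slack = soft / 10;
    if (extra > RLIM_INFINITY - slack ||
        soft > RLIM_INFINITY - slack - extra)
        return RLIM_INFINITY;

    return soft + slack + extra;
}

/*
 * Hypothetical helper: drop to "uid" and never continue on failure.
 * This is the check that a vulnerable caller omits in case (a).
 */
void drop_privileges_or_die(uid_t uid)
{
    if (setuid(uid) != 0)
        _exit(EXIT_FAILURE);

    /* Belt and braces: confirm the IDs really changed. */
    if (getuid() != uid || geteuid() != uid)
        _exit(EXIT_FAILURE);
}
```

A caller would invoke drop_privileges_or_die(target_uid) right before
execve().  With extra set to 0, setuid_nproc_limit(0, 0) is 0, so the
added margin disappears exactly when the limit is 0.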