"Serge E. Hallyn" <serge@xxxxxxxxxx> writes:

> Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
>> Stefan Berger <stefanb@xxxxxxxxxxxxxxxxxx> writes:
>>
>> > On 07/13/2017 01:14 PM, Eric W. Biederman wrote:
>> >> Theodore Ts'o <tytso@xxxxxxx> writes:
>> >>
>> >>> On Thu, Jul 13, 2017 at 07:11:36AM -0500, Eric W. Biederman wrote:
>> >>>> The concise summary:
>> >>>>
>> >>>> Today we have the xattr security.capable that holds a set of
>> >>>> capabilities that an application gains when executed.  AKA setuid root exec
>> >>>> without actually being setuid root.
>> >>>>
>> >>>> User namespaces have the concept of capabilities that are not global but
>> >>>> are limited to their user namespace.  We do not currently have
>> >>>> filesystem support for this concept.
>> >>> So correct me if I am wrong; in general, there will only be one
>> >>> variant of the form:
>> >>>
>> >>>    security.foo@uid=15000
>> >>>
>> >>> It's not like there will be:
>> >>>
>> >>>    security.foo@uid=1000
>> >>>    security.foo@uid=2000
>> >>>
>> >>> Except.... if you have a Distribution root directory which is shared
>> >>> by many containers, you would need to put the xattrs in the overlay
>> >>> inodes.  Worse, each time you launch a new container, with a new
>> >>> subuid allocation, you will have to iterate over all files with
>> >>> capabilities and do a copy-up operation on the xattrs in overlayfs.
>> >>> So that's actually a bit of a disaster.
>> >>>
>> >>> So for distribution overlays, you will need to do things a different
>> >>> way, which is to map the distro subdirectory so you know that the
>> >>> capability with the global uid 0 should be used for the container
>> >>> "root" uid, right?
>> >>>
>> >>> So this hack of using security.foo@uid=1000 is *only* useful when the
>> >>> subcontainer root wants to create the privileged executable.  You
>> >>> still have to do things the other way.
>> >>>
>> >>> So can we make perhaps the assertion that *either*:
>> >>>
>> >>>    security.foo
>> >>>
>> >>> exists, *or*
>> >>>
>> >>>    security.foo@uid=BAR
>> >>>
>> >>> exists, but never both?  And there BAR is exclusive to only one
>> >>> instance?
>> >>>
>> >>> Otherwise, I suspect that the architecture is going to turn around and
>> >>> bite us in the *ss eventually, because someone will want to do
>> >>> something crazy and the solution will not be scalable.
>> >> Yep.  That is what it looks like from here.
>> >>
>> >> Which is why I asked the question about scalability of the xattr
>> >> implementations.  It looks like trying to accommodate the general
>> >> case just gets us in trouble, and sets unrealistic expectations.
>> >>
>> >> Which strongly suggests that Serge's previous version that
>> >> just revved the format of security.capable so that a uid field could
>> >> be added is likely to be the better approach.
>> >>
>> >> I want to see what Serge and Stefan have to say but the case looks
>> >> pretty clear cut at the moment.
>
> I'm fine with that.  Now, we'll be doing the enforcement at xattr
> write time, meaning someone *can* come up with an fs image with >1
> such xattrs.  Which is *fine*, I believe, it won't break anything
> security-wise, and our goal is only to stop users from thinking it
> is legitimate to write multiple such xattrs, so that they don't later
> bug the fs folks like Ted saying "hey why can't I write 1000 of these,
> I think that's a bug."
>
> So at xattr write time,
>
>   1. if there is already an xattr, and it is either the global
>      non-namespaced xattr, or it has kuid=X where X is the kuid
>      mapped to root in a parent of the container, then we refuse
>      the write
>   2. if there is already an xattr, and it is for a kuid=X where
>      X is mapped into the container, then we overwrite the existing
>      xattr.
>
> At read/use time, we use the rules we have now.
>
> Does that seem reasonable?

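To make the two write-time rules above concrete, here is a rough Python
model of the decision, not kernel code.  The names CapXattr, WriterNs and
decide_write are made up for illustration.  The "create" case (no xattr
present yet) is implied rather than stated in the rules, and the final
fallback (refuse when the existing xattr's kuid is neither a parent's root
nor mapped into the container) is an assumption, since the rules as quoted
do not cover that case.  The model only considers a writer inside a
container.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CapXattr:
    # rootid is None for the global, non-namespaced xattr; otherwise it is
    # the kuid of the namespace root the capability is scoped to.
    rootid: Optional[int]


@dataclass(frozen=True)
class WriterNs:
    # Hypothetical description of the container doing the write.
    ancestor_root_kuids: frozenset  # kuids mapped to root in parent namespaces
    mapped_kuids: range             # kuids mapped into this container


def decide_write(existing: Optional[CapXattr], ns: WriterNs) -> str:
    """Return "create", "overwrite" or "refuse" for a capability xattr write."""
    if existing is None:
        # Nothing on the file yet, so this write yields the single xattr.
        return "create"
    # Rule 1: the global xattr, or one owned by a parent's root, is off limits.
    if existing.rootid is None or existing.rootid in ns.ancestor_root_kuids:
        return "refuse"
    # Rule 2: an xattr for a kuid mapped into the container gets replaced,
    # which keeps the file at one such xattr.
    if existing.rootid in ns.mapped_kuids:
        return "overwrite"
    # Assumption: anything else is refused conservatively.
    return "refuse"


# Example: a nested container whose uids map to kuids 101000..101999,
# with parents whose roots are kuid 0 and kuid 100000.
container = WriterNs(
    ancestor_root_kuids=frozenset({0, 100000}),
    mapped_kuids=range(101000, 102000),
)
print(decide_write(None, container))              # create
print(decide_write(CapXattr(None), container))    # refuse (global xattr)
print(decide_write(CapXattr(100000), container))  # refuse (parent's root)
print(decide_write(CapXattr(101000), container))  # overwrite
```

Either way the file ends up with at most one such xattr after a write
through this path, which is the property being asked for.
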
That sounds like it would keep us to one xattr of any given type, so yes.

It occurs to me while I am writing this that this is also important for
ima/evm.  There is an xattr that holds a hash of all of the other
security-relevant xattrs.  Without a limit on the number of xattrs,
calculating that security xattr could become prohibitively slow.

Eric
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers