On Mon, May 16, 2016 at 04:15:23PM -0500, Serge E. Hallyn wrote:
> Quoting Serge E. Hallyn (serge@xxxxxxxxxx):
> ...
> > There's a problem though. The above suffices to prevent an unprivileged user
> > in a user_ns from unsharing a user_ns to write a file capability and exploit
> > that capability in the ns where he is unprivileged. With one exception, which
> > is the case where the unprivileged user is mapped to the same kuid which
> > created the namespace. So if uid 1000 on the host creates a namespace
> > where uid 1000 maps to 1000 in the namespace, then 1000 in the namespace
> > can create a new user_ns, write the xattr, and exploit it from the
> > parent namespace. This is not an uncommon case. I'm not sure what to do about
> > it.
>
> Ok I think I've convinced myself that requiring a kuid 0 in the container
> and storing that in the security.nscapability is the best solution. The DAC
> objection is imo not really valid - we don't have to give uid 0 in the
> container any special privilege, we just require that the ns have a uid 0
> mapping. I have not been able to think of any other reliable way to verify
> that the writer of the capability is authorized to grant privilege to the
> file when executed by current.
>
> I'm going to proceed with another POC based on the following design:
>
> 1. no new syscalls at the moment. You can choose to set/query
> security.nscapability, but can also just set security.capability from
> a user_ns and have the kernel transparently set a security.nscapability
> entry for you.
>
> 2. For now just a single security.nscapability entry, but in a format
> that turning it into an array will be a trivial change
>
> 3. When running file foo which has a security.nscapability for kuid 100000,
> then any namespace where kuid 100000 is root - or which has an ancestor ns where
> that is the case - will run the file with the listed capabilities.
>
> 4. When doing getxattr of security.capability from a user_ns, if there is a
> security.capability entry, that will be returned; else if there is a valid
> security.nscapability for your ns, that will be returned.
>
> 5. when doing a setxattr of security.capability from a user_ns, if there is
> a security.nscapability entry, you get EBUSY; else a security.nscapability
> with your root kuid will be written provided that (a) you are privileged
> over your namespace, (b) you are privileged over your root uid, (c) the
> file owner maps into your namespace.

Stéphane pointed out this isn't quite right. You get EBUSY only if the
existing security.nscapability is defined with a kuid over which the
writer is not privileged; otherwise the entry is overwritten. You also
get EBUSY if security.capability is set.

> 6. when doing a getxattr of security.nscapability, the entry will be shown
> with kuid mapped into your namespace or -1 if the uid does not map into
> your ns.
>
> 7. when doing a setxattr of security.nscapability, if an entry exists, you
> get -EBUSY; if you are not privileged over your ns, your root uid, and
> the file owner, then you get -EPERM; the xattr includes a uid field, which
> must be either 0 or a value valid in your ns. The value will be converted
> to a kuid and stored on disk. (Seth, I'm not sure offhand how that should
> mesh with your patches, we can talk about it after I send the next patch,
> which I'm quite certain will handle it wrongly)
>
> 8. If a security.capability exists, it will override any security.nscapability
> at execve() (so, inverse of my previous two patches).
>
> -serge
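
To make points 3, 5 (as corrected above) and 8 concrete, here is a rough
userspace model in Python. It is only a sketch, not kernel code: the names
UserNS, FileXattrs, ns_covers, caps_at_exec and set_capability_from_userns
are hypothetical, "privileged over" is reduced to booleans supplied by the
caller, the order of the checks is a guess, and returning -EPERM when
conditions (a)-(c) fail is borrowed from point 7 rather than stated in
point 5.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserNS:
    """Toy user namespace; root_kuid is the kuid that uid 0 maps to."""
    root_kuid: int
    parent: Optional["UserNS"] = None


@dataclass
class FileXattrs:
    capability: Optional[str] = None   # security.capability payload
    nscap_kuid: Optional[int] = None   # kuid stored in security.nscapability
    nscap_caps: Optional[str] = None   # capabilities listed in that entry


def ns_covers(ns: Optional[UserNS], kuid: int) -> bool:
    """Point 3: kuid is root in ns, or in some ancestor of ns."""
    while ns is not None:
        if ns.root_kuid == kuid:
            return True
        ns = ns.parent
    return False


def caps_at_exec(f: FileXattrs, ns: UserNS) -> Optional[str]:
    """Point 8: security.capability overrides; otherwise apply point 3."""
    if f.capability is not None:
        return f.capability
    if f.nscap_kuid is not None and ns_covers(ns, f.nscap_kuid):
        return f.nscap_caps
    return None


def set_capability_from_userns(f: FileXattrs, ns: UserNS, caps: str, *,
                               privileged: bool, owner_maps: bool,
                               privileged_over_existing: bool = True) -> int:
    """Point 5 as corrected: returns 0 or a negative errno."""
    if f.capability is not None:
        return -errno.EBUSY            # security.capability already set
    if f.nscap_kuid is not None and not privileged_over_existing:
        return -errno.EBUSY            # entry belongs to a kuid we don't control
    if not (privileged and owner_maps):
        return -errno.EPERM            # assumption: error code taken from point 7
    f.nscap_kuid, f.nscap_caps = ns.root_kuid, caps   # write or overwrite
    return 0
```

For example, with a container whose root is kuid 100000 and a namespace
nested inside it, a capability written from the container applies in both,
but not in an unrelated namespace rooted at a different kuid.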