On Tue, Oct 17, 2017 at 11:44 AM, James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 2017-10-17 at 11:28 -0400, Simo Sorce wrote:
>> > Without a *kernel* policy on containerIDs you can't say what
>> > security policy is being exempted.
>>
>> The policy has been basically stated earlier.
>>
>> A way to track a set of processes from a specific point in time
>> forward. The name used is "container id", but it could be anything.
>> This marker is mostly used by user space to track process hierarchies
>> without races, these processes can be very privileged, and must not
>> be allowed to change the marker themselves when granted the current
>> common capabilities.
>>
>> Is this a good enough description ? If not can you clarify your
>> expectations ?
>
> I think you mean you want to be able to apply a label to a process
> which is inherited across forks. The label should only be susceptible
> to modification by something possessing a capability (which one TBD).
> The idea is that processes spawned into a container would be labelled
> by the container orchestration system. It's unclear what should happen
> to processes using nsenter after the fact, but policy for that should
> be up to the orchestration system.
>
> The label will be used as a tag for audit information.
>
> I think you were missing label inheritance above.

That is a pretty good summary of what we want to do, and what Richard
and I have discussed while brainstorming this offline. The details
may not have translated well into those initial emails from Richard,
but I think you've got the idea, even if some of the smaller details
are still TBD.
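
To make the summary above concrete, here is a rough userspace model of
the semantics being described: a label that follows a process across
forks and can only be changed by something holding a capability. It is
only a sketch. The names Process, set_label and CAP_SET_LABEL are made
up for illustration, the real capability is still TBD, and none of this
is kernel code or a proposed interface.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical capability name; the thread leaves the real one TBD.
CAP_SET_LABEL = "CAP_SET_LABEL"


@dataclass
class Process:
    pid: int
    label: Optional[int] = None      # None models "no label applied"
    caps: frozenset = frozenset()    # capabilities held by this process

    def fork(self, child_pid: int) -> "Process":
        # A child starts out with its parent's label and capabilities.
        return Process(child_pid, self.label, self.caps)

    def drop_caps(self) -> None:
        # Models a containerised process giving up its capabilities.
        self.caps = frozenset()


def set_label(actor: Process, target: Process, label: int) -> None:
    # Only an actor holding the (hypothetical) capability may label a process.
    if CAP_SET_LABEL not in actor.caps:
        raise PermissionError("actor lacks the labelling capability")
    target.label = label


# The orchestrator itself is unlabelled and holds the capability.
orchestrator = Process(pid=100, caps=frozenset({CAP_SET_LABEL}))

# It forks a process into a container, labels it, and the child drops caps.
container_init = orchestrator.fork(101)
set_label(orchestrator, container_init, 42)
container_init.drop_caps()

# Anything forked inside the container carries the same label.
worker = container_init.fork(102)
```

In this model worker.label is 42 and worker cannot relabel itself,
while the orchestrator stays unlabelled. What happens to a process that
joins later via nsenter is deliberately left out, since that policy is
meant to belong to the orchestration system.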

FWIW, right now I'm not as worried about the exact capability or the
size of the audit container ID, I think those things will sort
themselves out as we progress through the implementation, especially
once we get to the next stage when we start to allow copies of the
audit records to be routed to audit daemons running inside containers
(note well that I said "copies", the host system still sees all).

> The security implications are that anything that can change the label
> could also hide itself and its doings from the audit system and thus
> would be used as a means to evade detection. I actually think this
> means the label should be write once (once you've set it, you can't
> change it) ...

Richard and I have talked about a write once approach, but the
thinking was that you may want to allow a nested container
orchestrator (Why? I don't know, but people always want to do the
craziest things.) and a write-once policy makes that impossible. If
we punt on the nested orchestrator, I believe we can seriously think
about a write-once policy to simplify things.

A bit off topic, but I've also wondered about not even implementing
read access, just to help ensure the audit container ID wouldn't be
abused, but I'm not sure how practical that will be. Something else
to sort out during the RFC phase of the implementation with the
container orchestrators.

> ... and orchestration systems should begin as unlabelled
> processes allowing them to do arbitrary forks.

My current thinking is that the default state is to start unlabeled (I
just vomited a little into my SELinux hat); in other words
init/systemd/PID-1 in the host system starts with an "unset" audit
container ID. This not only helps define the host system (anything
that has an unset audit container ID) but provides a blank slate for
the orchestrator(s).
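
For what it's worth, the write-once idea and the "unset" default can
be modelled in a few lines, which also shows where the nested
orchestrator case falls over. Again this is only a sketch under
assumptions: AuditContainerId, set_once, inherit and is_host are
hypothetical names, None stands in for the "unset" state, and the ID
values are arbitrary.

```python
from typing import Optional


class AuditContainerId:
    """Write-once holder; None stands for the "unset" state."""

    def __init__(self, value: Optional[int] = None) -> None:
        self._value = value

    @property
    def is_set(self) -> bool:
        return self._value is not None

    def set_once(self, value: int) -> None:
        # Write once: a second attempt is refused rather than overwriting.
        if self._value is not None:
            raise PermissionError("audit container ID already set")
        self._value = value

    def inherit(self) -> "AuditContainerId":
        # A forked task gets a copy of its parent's current state.
        return AuditContainerId(self._value)


def is_host(task_id: AuditContainerId) -> bool:
    # Anything with an unset ID is, by definition, part of the host system.
    return not task_id.is_set


# PID 1 on the host starts unset, and so does an orchestrator forked from it.
init_id = AuditContainerId()
orchestrator_id = init_id.inherit()

# The orchestrator forks a task and sets its ID exactly once.
container_id = orchestrator_id.inherit()
container_id.set_once(1001)

# A nested orchestrator inside that container forks a task of its own,
# which already carries 1001 and therefore cannot be given a new ID.
nested_id = container_id.inherit()
try:
    nested_id.set_once(2002)
    nested_relabel_allowed = True
except PermissionError:
    nested_relabel_allowed = False
```

Here is_host() is true for init_id and orchestrator_id and false for
container_id, and nested_relabel_allowed ends up False. That last
result is the trade-off mentioned above: write-once is simple, but it
rules out a nested orchestrator assigning its own IDs.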

> For nested containers, I actually think the label should be
> hierarchical, so you can add a label for the new nested container but
> it still also contains its parents label as well.

I haven't made up my mind on this completely just yet, but I'm
currently of the mindset that supporting multiple audit container IDs
on a given process is not a good idea.

--
paul moore
www.paul-moore.com