Hello all,

The SELinux namespace effort has been stuck for several years as we try to solve the problem of managing individual file labels across multiple namespaces. Our only solution thus far, adding namespace-specific xattrs to each file, is relatively simple but doesn't scale, and it has the potential to become a management problem as a namespace-specific identifier needs to be encoded in the xattr name. Having continued to think about this, I believe I have an idea which might allow us to move past this problem and start making progress on SELinux namespaces. I'd like to get everyone's thoughts on the proposal below ...

THE IDEA

With the understanding that we only have one persistent label per file, we need to get a little creative with how we represent a single entity's label in both the parent and child namespaces. Since our existing approach towards SELinux policy for containers and VMs (sVirt) is to treat the container/VM as a single security domain, let's extend this philosophy to SELinux namespaces: a child namespace will appear as a single SELinux domain/type in the parent namespace, with newly created processes and objects all appearing to have the same type from the parent's point of view.

From the child namespace's perspective, everything will behave as it normally would: processes would run in multiple domains as determined by the namespace's policy, with files labeled according to the labeling rules defined in the namespace's policy (e.g. xattrs, context mounts, etc.).

The one exception to this would be existing mounted filesystems that are shared between parent and child namespaces: shared filesystems would be labeled according to the namespace which mounted the filesystem originally (the parent, grandparent, etc.), and those file labels would be shared across all namespace boundaries.
If a particular namespace does not have the necessary labels defined in its policy for a shared filesystem, those undefined labels will be represented just as bogus labels are represented today ("unlabeled_t"). For this to work well there must be shared understanding/types between the parent and child namespace SELinux policies, but if the namespaces are already sharing a filesystem this seems like a reasonable requirement.

I'll leave this as an exercise for the reader, but this approach should also support arbitrary nesting.

THOUGHTS ON MAKING IT WORK

One of the bigger challenges here is how to handle the case of the parent mounting a filesystem for full use by the child namespace (per-file labeling, etc.). Above I talked about how filesystems would be labeled according to the mounting namespace, so if we want to delegate labeling of the filesystem to a child namespace (without allowing the child to perform the mount) we need a mechanism to indicate that the mounting namespace is deferring labeling to a different namespace.

I think the obvious solution would be to add two new mount options: "selinuxns_outer=<label>" and "selinuxns_owner=<label>". The "selinuxns_outer" option would accomplish two things: mark the filesystem for deferred labeling by another namespace, and establish a single label, similar to a context mount, that the mounting namespace would see instead of whatever labeling the filesystem would normally support. The "selinuxns_owner" option would specify the domain label of the child namespace, granting that domain control over whatever labeling is supported by the filesystem. In most normal use cases, where the child namespace runs with a single domain/type from the parent's perspective, I would expect "selinuxns_outer" and "selinuxns_owner" to be set to the same value, although that is not a requirement.

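To make the two views concrete, here is a rough userspace model of the label resolution described above. It is only a sketch under stated assumptions: nothing here is kernel code or an existing API, the `Namespace`, `Mount`, `visible_label` and `controls_labeling` names are hypothetical, bare type names stand in for full labels, and the example types are purely illustrative. Nesting beyond one parent and one child is deliberately not modeled.

```python
from dataclasses import dataclass
from typing import Optional

UNLABELED = "unlabeled_t"  # how undefined labels are represented, per the proposal


@dataclass(frozen=True)
class Namespace:
    """Hypothetical model of one SELinux namespace."""
    name: str
    known_types: frozenset                   # types defined in this namespace's policy
    domain_in_parent: Optional[str] = None   # single type the parent sees (None for the top level)


@dataclass(frozen=True)
class Mount:
    """Hypothetical model of a mount and the two proposed options."""
    mounted_by: Namespace
    selinuxns_outer: Optional[str] = None    # label the mounting namespace sees
    selinuxns_owner: Optional[str] = None    # domain that controls per-file labeling

    def deferred(self) -> bool:
        # Setting selinuxns_outer marks the filesystem for deferred labeling.
        return self.selinuxns_outer is not None


def visible_label(stored_label: str, mount: Mount, viewer: Namespace) -> str:
    """Return the label a namespace sees for a file with one persistent label."""
    if mount.deferred() and viewer == mount.mounted_by:
        # The mounting namespace sees one label, similar to a context mount.
        return mount.selinuxns_outer
    # Otherwise the per-file label is shared across the namespace boundary,
    # falling back to unlabeled_t if the viewer's policy does not define it.
    return stored_label if stored_label in viewer.known_types else UNLABELED


def controls_labeling(mount: Mount, ns: Namespace) -> bool:
    """Assumption: who controls labeling follows selinuxns_owner when deferred."""
    if mount.deferred():
        return ns.domain_in_parent == mount.selinuxns_owner
    return ns == mount.mounted_by


# Illustrative setup: one parent and one child namespace.
parent = Namespace("parent", frozenset({"container_t", "etc_t", "usr_t"}))
child = Namespace("child", frozenset({"httpd_content_t", "etc_t"}),
                  domain_in_parent="container_t")

shared = Mount(mounted_by=parent)
delegated = Mount(mounted_by=parent,
                  selinuxns_outer="container_t",
                  selinuxns_owner="container_t")

print(visible_label("usr_t", shared, child))               # child policy lacks usr_t
print(visible_label("httpd_content_t", delegated, parent))
print(visible_label("httpd_content_t", delegated, child))
```

In this model the parent sees every file on the delegated mount with the single "selinuxns_outer" label, while the child sees its own per-file labels; on the plain shared mount both sides see the same stored label, unless the viewer's policy does not define it.
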
Triggering the creation of a child SELinux namespace, the userspace API in general, and the implementation work needed to support multiple views of the same kernel entities are all still very TBD/hand-wavy. I wanted to make sure the approach described here made sense first.

THOUGHTS ON POLICY

This is an area where I think the single-label parent view makes it much easier to develop policy for containing child namespaces. Since we want the parent namespace to effectively bound the access of the child namespace, treating the namespace as a single domain allows the parent to develop policy independent of the child's types and behaviors; the parent simply describes the allowed interactions and lets the child manage its own policy and labeling.

Filesystems shared across policy boundaries are somewhat interesting: for such a filesystem to be fully usable, every participating namespace must have the filesystem's labels defined in its own policy, but each namespace is not required to treat the files in the same manner. However, it is important to note that regardless of what a child namespace might allow in a shared filesystem, it is still subject to the policy rules of any parent namespaces.

-- 
paul-moore.com