On Fri, Oct 13, 2023 at 12:16 PM Stephen Smalley
<stephen.smalley.work@xxxxxxxxx> wrote:
> On Wed, Oct 11, 2023 at 6:55 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> >
> > Hello all,
> >
> > The SELinux namespace effort has been stuck for several years as we
> > try to solve the problem of managing individual file labels across
> > multiple namespaces. Our only solution thus far, adding namespace
> > specific xattrs to each file, is relatively simple but doesn't scale,
> > and has the potential to become a management problem as a namespace
> > specific identifier needs to be encoded in the xattr name. Having
> > continued to think about this problem, I believe I have an idea which
> > might allow us to move past this problem and start making progress on
> > SELinux namespaces. I'd like to get everyone's thoughts on the
> > proposal below ...
> >
> > THE IDEA
> >
> > With the understanding that we only have one persistent label
> > per-file, we need to get a little creative with how we represent a
> > single entity's label in both the parent and child namespaces. Since
> > our existing approach towards SELinux policy for containers and VMs
> > (sVirt) is to treat the container/VM as a single security domain,
> > let's continue this philosophy to a SELinux namespace: a child
> > namespace will appear as a single SELinux domain/type in the parent
> > namespace, with newly created processes and objects all appearing to
> > have the same type from the parent's point of view. From the child
> > namespace's perspective, everything will behave as they would
> > normally: processes would run in multiple domains as determined by the
> > namespace's policy, with files labeled according to the labeling rules
> > defined in the namespace's policy (e.g. xattrs, context mounts, etc.).
>
> I don't have any problems with the idea.
> However, where I got stuck
> with the original selinux namespace patches was not per-namespace
> filesystem security xattrs (which was James' contribution) ...

Thanks for taking a look and reviewing the idea. The multiple xattr
approach is okay-ish if you need *something* to get past the labeling
hurdle so you can work on other aspects of the implementation, but it
is not a viable solution upstream. The scaling and management
difficulties make it a non-starter in my opinion.

> ... but rather
> the need to support per-namespace in-core inode and superblock
> security blobs.

I didn't get into any implementation details because I was still
wrestling with the design issue of how we deal with only a single
on-disk label. In my mind that needed to be solved before we could
spend time thinking about how to implement it in a reasonable manner.

Regardless, you bring up a good point: there are still a number of
implementation challenges with namespacing SELinux. There is the issue
of supporting multiple labels on kernel entities such as superblocks,
inodes, tasks, etc., but I have a hunch that the more challenging issue
is going to be the various LSM/SELinux external "things" that deal with
a single secid/secctx. We will have to see how things develop with the
LSM stacking effort, but that might end up helping us in that area.

> You'd have to go back to my original posted patch
> series or the older selinuxns branches of my github repo to see my
> attempt at supporting those because they were dropped from the
> working-selinuxns branch due to the ongoing reworking of LSM to handle
> blob allocation by the security framework rather than by the
> individual security modules. I couldn't figure out how to make that
> work safely and efficiently, and AFAICT that still has to be addressed
> for the above idea to work.

Agreed, supporting multiple different views of an entity and mapping
that to a namespace in a quick and efficient manner is probably going
to be one of the larger technical challenges. I won't pretend to have
an answer for this yet, but I do believe we can figure something out.

> > THOUGHTS ON MAKING IT WORK
> >
> > One of the bigger challenges here is how to handle the case of the
> > parent mounting a filesystem for full use by the child namespace
> > (per-file labeling, etc.). Above I talked about how filesystems would
> > be labeled according to the mounting namespace, so if we want to
> > delegate labeling of the filesystem to a child namespace (without
> > allowing the child to perform the mount) we need to have a mechanism
> > to indicate that the mounting namespace is deferring labeling to a
> > different namespace. I think the obvious solution to that would be to
> > add two new mount options: "selinuxns_outer=<label>" and
> > "selinuxns_owner=<label>". The "selinuxns_outer" option would
> > accomplish two things: mark the filesystem for deferred labeling by
> > another namespace, and establish a single label, similar to a context
> > mount, that the mounting namespace would see instead of whatever
> > labeling the filesystem would normally support. The "selinuxns_owner"
> > option would specify the domain label of the child namespace, granting
> > that domain control over whatever labeling is supported by the
> > filesystem. In most normal use cases where the child namespace runs
> > with a single domain/type from the parent's perspective I would expect
> > "selinuxns_outer" and "selinuxns_owner" to be set to the same value,
> > although that is not a requirement.
>
> So with my earlier patch set (the one in my older selinuxns branch),
> one could already do the equivalent of selinuxns_outer just using the
> existing context= mount option.
> This is because it allowed for
> per-namespace superblock security blobs, so you could context mount in
> the parent namespace while still selecting per-file labeling in the
> child. That said, it had the issues I referenced above wrt safety and
> efficiency.

I'm open to other ideas on how to make this work, but I do think it is
important to separate a context mounted filesystem that is shared
across multiple namespaces from a "selinuxns_outer" (or whatever we
want to use) mounted filesystem that appears like a context mount to
the mounting namespace while supporting native labeling for a child
namespace. If there is a clever way to do that with existing mount
options, that's even better.

> For selinuxns_owner, I'm not clear on where/how that would
> be used.

My idea behind the "selinuxns_owner" option was to provide an easy way
to identify which domains in the mounting/parent namespace would have
the ability to enable native filesystem labeling in their own child
namespace. It seemed like a reasonable way to simplify the SELinux
namespace API by not needing to specify, when the child namespace is
initiated, all of the mounted filesystems that the child namespace
would have labeling control over.

> Note that the context you assign to files will quite often
> differ from the context assigned to the processes; hence, if
> selinuxns_owner is meant to be the context of a process, it usually
> won't be the same as selinux_outer.

That came up in some offline discussions too. When I was kicking
things around in my mind it was easier to reason about things,
especially nested namespaces, if the child namespace was represented by
a single type in the parent namespace. Unfortunately that
simplification ended up leaking out into my email. If it makes things
easier to read/understand, I would recommend simply removing that
sentence from my email; it shouldn't affect the design either way.

-- 
paul-moore.com