On Wed, Aug 7, 2024 at 1:02 PM Stephen Smalley
<stephen.smalley.work@xxxxxxxxx> wrote:
>
> On Tue, Aug 6, 2024 at 12:56 PM Stephen Smalley
> <stephen.smalley.work@xxxxxxxxx> wrote:
> > With these changes applied, the following sequence works to
> > demonstrate creating a new SELinux namespace:
> > # Ask to unshare SELinux namespace on next exec
> > $ echo 1 > /sys/fs/selinux/unshare
> > # Unshare the mount and network namespaces too.
> > # This is required in order to create our own selinuxfs mount for the
> > # new namespace and to isolate our own SELinux netlink socket.
> > $ unshare -m -n
> > # Mount our own selinuxfs instance for our new SELinux namespace
> > $ mount -t selinuxfs none /sys/fs/selinux
> > # Load a policy into our SELinux namespace
> > $ load_policy
> > # Create a shell in the unconfined user/role/domain
> > $ runcon unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 /bin/bash
> > $ setenforce 1
> > $ id
> > uid=0(root) gid=0(root) groups=0(root)
> > context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> >
> > NB This new namespace is NOT currently confined by its parent. And
> > there remain many unresolved issues.
>
> A couple of additional changes pushed, one to fix a bug in the inode
> handling and another to introduce support for revalidating superblock
> SIDs and updating them as needed for the namespace. With these
> changes, the selinux-testsuite filesystem-related tests appear to pass
> within a new SELinux namespace. Other tests vary - some pass, some
> fail, some hang.

I think before we proceed further with the SELinux namespaces support,
we need to decide on how we are going to handle SIDs, since that has a
significant impact on the approach. There are (at least) two options:

1) SIDs are maintained per-namespace. This is the current approach in
the patch series, since the existing SELinux SID table (sidtab) is
actually a mapping from SIDs to security context structures, not
strings (with the exception of undefined contexts with the deferred
mapping support), where the structures contain the policy indices for
the relevant user/role/type/level.

2) SIDs are maintained globally, e.g. we introduce a new SID table
outside of the security server that maps SIDs to security context
strings (hence policy-independent). This would be more akin to Smack's
known label list, which is also used to assign SIDs, and would provide
a stable pointer for context strings that could be cached in the inode
security blobs without needing to maintain per-inode copies of the
context strings.
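
To make option #2 a bit more concrete, below is a rough sketch of the
sort of thing I have in mind (illustrative only; none of this is code
from the patch series, all of the names are made up, and a real
implementation would presumably use a hash table or xarray rather than
a linear list):

#include <linux/list.h>
#include <linux/rculist.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include "flask.h"	/* SECINITSID_NUM */
#include "security.h"	/* SECSID_NULL */

/* One entry per known security context string; entries are never freed. */
struct global_sid_entry {
	struct list_head list;
	u32 sid;	/* global SID, stable across namespaces */
	char *ctx;	/* security context string */
	u32 ctx_len;
};

static LIST_HEAD(global_sid_list);
static DEFINE_SPINLOCK(global_sid_lock);
static u32 global_sid_next = SECINITSID_NUM + 1;

/* Return the existing global SID for @ctx or assign a new one. */
static u32 global_sid_find_or_create(const char *ctx, u32 ctx_len)
{
	struct global_sid_entry *entry, *new;
	u32 sid;

	/* Allocate up front so the search + insert is one critical section. */
	new = kzalloc(sizeof(*new), GFP_KERNEL);
	if (!new)
		return SECSID_NULL;
	new->ctx = kmemdup(ctx, ctx_len, GFP_KERNEL);
	if (!new->ctx) {
		kfree(new);
		return SECSID_NULL;
	}
	new->ctx_len = ctx_len;

	spin_lock(&global_sid_lock);
	list_for_each_entry(entry, &global_sid_list, list) {
		if (entry->ctx_len == ctx_len &&
		    !memcmp(entry->ctx, ctx, ctx_len)) {
			sid = entry->sid;
			spin_unlock(&global_sid_lock);
			kfree(new->ctx);
			kfree(new);
			return sid;
		}
	}
	new->sid = global_sid_next++;
	list_add_rcu(&new->list, &global_sid_list);
	spin_unlock(&global_sid_lock);
	return new->sid;
}

/* Map a global SID back to its context string. */
static const char *global_sid_to_ctx(u32 sid, u32 *ctx_len)
{
	struct global_sid_entry *entry;

	rcu_read_lock();
	list_for_each_entry_rcu(entry, &global_sid_list, list) {
		if (entry->sid == sid) {
			*ctx_len = entry->ctx_len;
			rcu_read_unlock();
			return entry->ctx;
		}
	}
	rcu_read_unlock();
	*ctx_len = 0;
	return NULL;
}

Since entries would never be freed, the returned context string pointer
stays stable for the lifetime of the system, which is what would let the
object security blobs cache it directly instead of keeping per-object
copies of the context string. Presumably each namespace's security
server would still map the string to its own policy representation when
it needs to make a decision, and could keep its own string-to-policy
cache for that.
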
I started with approach #1 because that was how the existing SID table
works within SELinux. However, approach #2 has a number of advantages:

- It matches the LSM hook interface handling of secids, which assumes
that secids are global identifiers and allows kernel data structures
outside of the LSM to cache and pass secids back into the LSM later,
e.g. for audit and networking.

- It avoids the need to revalidate and re-map SIDs in the object
security blobs prior to each use, since they would be global,
eliminating the complexity associated with
__inode_security_revalidate(), sbsec_revalidate(), and doing the same
for all the remaining object security blobs.

- It would remove the need to instantiate the network SID caches
(netif, netnode, netport) per-namespace.

- It would potentially allow for a global AVC (currently
per-namespace), aside from the question of how to handle the policy
seqno and when to flush the AVC (e.g. introduce a global policy seqno
that is incremented on any reload in any namespace?).

Given these advantages, I am inclined to switch to approach #2. Thoughts?