On Wed, Aug 7, 2024 at 1:02 PM Stephen Smalley
<stephen.smalley.work@xxxxxxxxx> wrote:
>
> On Tue, Aug 6, 2024 at 12:56 PM Stephen Smalley
> <stephen.smalley.work@xxxxxxxxx> wrote:
> > With these changes applied, the following sequence works to
> > demonstrate creating a new SELinux namespace:
> > # Ask to unshare SELinux namespace on next exec
> > $ echo 1 > /sys/fs/selinux/unshare
> > # Unshare the mount and network namespaces too.
> > # This is required in order to create our own selinuxfs mount for the
> > # new namespace and to isolate our own SELinux netlink socket.
> > $ unshare -m -n
> > # Mount our own selinuxfs instance for our new SELinux namespace
> > $ mount -t selinuxfs none /sys/fs/selinux
> > # Load a policy into our SELinux namespace
> > $ load_policy
> > # Create a shell in the unconfined user/role/domain
> > $ runcon unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 /bin/bash
> > $ setenforce 1
> > $ id
> > uid=0(root) gid=0(root) groups=0(root)
> > context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> >
> > NB This new namespace is NOT currently confined by its parent. And
> > there remain many unresolved issues.
>
> A couple of additional changes pushed, one to fix a bug in the inode
> handling and another to introduce support for revalidating superblock
> SIDs and updating them as needed for the namespace. With these
> changes, the selinux-testsuite filesystem-related tests appear to pass
> within a new SELinux namespace. Other tests vary - some pass, some
> fail, some hang.

I think before we proceed further with the SELinux namespaces support,
we need to decide on how we are going to handle SIDs, since that has a
significant impact on the approach. There are (at least) two options:

1) SIDs are maintained per-namespace. This is the current approach in
the patch series, since the existing SELinux SID table (sidtab) is
actually a mapping from SIDs to security context structures, not
strings (with the exception of undefined contexts with the deferred
mapping support), where the structures contain the policy indices for
the relevant user/role/type/level.

2) SIDs are maintained globally, e.g. we introduce a new SID table
outside of the security server that maps SIDs to security context
strings (hence policy-independent). This would be more akin to Smack's
known label list, which is also used to assign SIDs, and would provide
a stable pointer for context strings that could be cached in the inode
security blobs without needing to maintain per-inode copies of the
context strings.
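
To make option #2 a bit more concrete, below is a rough sketch of the
sort of thing I have in mind (illustrative only; none of this is code
from the patch series, all of the names are made up, and a real
implementation would presumably use a hash table or xarray rather than
a linear list):

#include <linux/list.h>
#include <linux/rculist.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include "flask.h"	/* SECINITSID_NUM */
#include "security.h"	/* SECSID_NULL */

/* One entry per known security context string; entries are never freed. */
struct global_sid_entry {
	struct list_head list;
	u32 sid;	/* global SID, stable across namespaces */
	char *ctx;	/* security context string */
	u32 ctx_len;
};

static LIST_HEAD(global_sid_list);
static DEFINE_SPINLOCK(global_sid_lock);
static u32 global_sid_next = SECINITSID_NUM + 1;

/* Return the existing global SID for @ctx or assign a new one. */
static u32 global_sid_find_or_create(const char *ctx, u32 ctx_len)
{
	struct global_sid_entry *entry, *new;
	u32 sid;

	/* Allocate up front so the search + insert is one critical section. */
	new = kzalloc(sizeof(*new), GFP_KERNEL);
	if (!new)
		return SECSID_NULL;
	new->ctx = kmemdup(ctx, ctx_len, GFP_KERNEL);
	if (!new->ctx) {
		kfree(new);
		return SECSID_NULL;
	}
	new->ctx_len = ctx_len;

	spin_lock(&global_sid_lock);
	list_for_each_entry(entry, &global_sid_list, list) {
		if (entry->ctx_len == ctx_len &&
		    !memcmp(entry->ctx, ctx, ctx_len)) {
			sid = entry->sid;
			spin_unlock(&global_sid_lock);
			kfree(new->ctx);
			kfree(new);
			return sid;
		}
	}
	new->sid = global_sid_next++;
	list_add_rcu(&new->list, &global_sid_list);
	spin_unlock(&global_sid_lock);
	return new->sid;
}

/* Map a global SID back to its context string. */
static const char *global_sid_to_ctx(u32 sid, u32 *ctx_len)
{
	struct global_sid_entry *entry;

	rcu_read_lock();
	list_for_each_entry_rcu(entry, &global_sid_list, list) {
		if (entry->sid == sid) {
			*ctx_len = entry->ctx_len;
			rcu_read_unlock();
			return entry->ctx;
		}
	}
	rcu_read_unlock();
	*ctx_len = 0;
	return NULL;
}

Since entries would never be freed, the returned context string pointer
stays stable for the lifetime of the system, which is what would let the
object security blobs cache it directly instead of keeping per-object
copies of the context string. Presumably each namespace's security
server would still map the string to its own policy representation when
it needs to make a decision, and could keep its own string-to-policy
cache for that.
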
I started with approach #1 because that was how the existing SID table
works within SELinux. However, approach #2 has a number of advantages:

- It matches the LSM hook interface handling of secids, which assumes
that secids are global identifiers and allows kernel data structures
outside of the LSM to cache and pass secids back into the LSM later,
e.g. for audit and networking.

- It avoids the need to revalidate and re-map SIDs in the object
security blobs prior to each use, since they would be global,
eliminating the complexity associated with
__inode_security_revalidate(), sbsec_revalidate(), and doing the same
for all the remaining object security blobs.

- It would remove the need to instantiate the network SID caches
(netif, netnode, netport) per-namespace.

- It would potentially allow for a global AVC (currently
per-namespace), aside from the question of how to handle the policy
seqno and when to flush the AVC (e.g. introduce a global policy seqno
that is incremented on any reload in any namespace?).

Given these advantages, I am inclined to switch to approach #2. Thoughts?