On Tue, Sep 10, 2024 at 3:13 PM Stephen Smalley
<stephen.smalley.work@xxxxxxxxx> wrote:
> I finally got everything that requires global SIDs to use the new
> global SID table, now available in my branch,
> https://github.com/stephensmalley/selinux-kernel/tree/working-selinuxns
>
> This proved to be more involved than I had anticipated; there are a
> number of subtleties in our handling of contexts and NetLabel was
> caching SIDs in two locations, one of which is read outside the
> security server (hence, required a global SID) and the other one is
> only directly read/written inside the security server (hence, can be
> mapped to/from global SIDs at the security server interface).
>
> With these changes, I could drop the changes for revalidating and
> updating inode, superblock, and open file SIDs per-namespace, so those
> have been dropped from the branch although still available in a
> working-selinuxns-beforeglobalsids branch for reference.
>
> The SELinux testsuite passes, including NFS tests, in the initial
> SELinux namespace, and everything except for the
> socket/networking-related and mac_admin tests pass in a child SELinux
> namespace. The socket/networking-related test failures are
> unsurprising, partly due to needing to also unshare the network
> namespace to avoid the SELinux netlink notifications from confusing
> the initial namespace's AVC (wrong/out-of-order policy seqno) and
> partly due to the fact that we do not yet have a way to associate the
> SELinux namespace for use in hooks that occur outside of process
> context. The mac_admin test failures are likewise unsurprising
> (getting/setting unmapped context values); I had similar issues with
> those tests even in the initial namespace before making some further
> specialized changes; I will see if I can get that to work properly in
> the child namespace too.
>
> Going back to the list of known issues to resolve and omitting the
> ones resolved by having global SIDs, we are left with the following:
>
> 1. Updating hook functions called outside of process context, e.g.
> task_kill, send_sigiotask, network input/forward, to use the correct
> SELinux namespace instead of using the current one; this requires
> storing a pointer to the SELinux namespace in some relevant data
> structure from which we can fetch it in those hook functions, e.g. the
> file or socket security blobs.

Pushed a revised commit to address the lingering mac_admin test
failure and added a new commit for #1 above, fixing the hooks called
outside of process context to use the correct SELinux namespace with
the exception of a couple xfrm hooks that don't appear to have ready
access to any object from which I can obtain the SELinux namespace
(e.g. no sock structure, just the xfrm and flow structures with no
place to save a namespace for later reference AFAICT).

I can now successfully run all of the selinux-testsuite except for the
inet_socket, sctp, and extended_socket_class tests from within a child
SELinux namespace created as per the instructions at the end of the
email. The extended_socket_class tests only fail on one test (creating
an AF_BLUETOOTH socket) with an "Address family not supported by
protocol" error that I suspect is merely due to running within a
non-init network namespace too (to avoid confusing the parent SELinux
namespace with netlink SELinux notifications from the child; failing
to unshare network namespace caused userspace policy enforcers to go
crazy when they received a policy seqno greater than their own
namespace's policy seqno). Similarly, most if not all of the inet and
sctp failures appear to be due to running in a separate network
namespace; netlabelctl falls over immediately with an error during
initialization.

Next up is tackling #2 below.

> 2. Updating the SELinux hook functions to check permissions against
> all ancestor namespaces rather than just the current one, and consider
> introducing a top-level global AVC to avoid the need to check against
> each per-namespace AVC on every check.
>
> 3. Providing a way to restrict or bound nesting of SELinux namespaces,
> particularly given the resource usage associated with loading a policy
> per-namespace and having a per-namespace AVC, sidtab, etc.
>
> 4. Hardening the policy loading code and other selinuxfs interfaces to
> support potentially unprivileged usage by child namespaces.
>
> 5. Optimizing the namespace support, global SID table, etc to avoid
> imposing significant overheads especially for the case where there is
> only a single namespace since that will likely remain the common case.
>
> Reminder on usage for anyone who wants to play with it:
> # Unshare the SELinux namespace
> echo 1 > /sys/fs/selinux/unshare
> # Unshare the mount and network namespaces
> # Required so that we can have our own private selinuxfs mount and
> # so that we can have our own private NETLINK_SELINUX socket.
> unshare -m -n
> # Unmount the parent namespace's selinuxfs and mount our new one.
> umount /sys/fs/selinux
> mount -t selinuxfs none /sys/fs/selinux
> # Load a policy into our namespace
> load_policy
> # Switch to a safe context
> runcon unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 /bin/bash
> # Go enforcing
> setenforce 1
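
To make item 2 concrete, here is a rough userspace sketch of the
ancestor walk, not code from the branch. It assumes each namespace
holds a pointer to its parent and that an operation is granted only if
the current namespace and every ancestor allow it. The names
ns_sketch, ns_allows, and ns_check_all are hypothetical, and the
bitmask is a toy stand-in for a real per-namespace policy/AVC lookup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-namespace SELinux state (not the kernel struct). */
struct ns_sketch {
    const struct ns_sketch *parent; /* NULL for the initial namespace */
    uint32_t allowed;               /* toy "policy": bitmask of permitted operations */
};

/* Toy per-namespace decision: every requested bit must be allowed. */
bool ns_allows(const struct ns_sketch *ns, uint32_t requested)
{
    return (ns->allowed & requested) == requested;
}

/*
 * Grant only if the starting namespace and all of its ancestors allow
 * the request. The walk costs one lookup per nesting level.
 */
bool ns_check_all(const struct ns_sketch *ns, uint32_t requested)
{
    for (; ns != NULL; ns = ns->parent) {
        if (!ns_allows(ns, requested))
            return false;
    }
    return true;
}
```

With an initial namespace allowing 0x3, a child allowing 0x1, and a
grandchild allowing 0x7, ns_check_all() on the grandchild grants 0x1
but denies 0x2 and 0x4, since the child rejects both. The per-level
lookup in the loop is the cost that a top-level global AVC would be
meant to avoid.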