On Tue, Sep 3, 2024 at 10:58 AM Stephen Smalley
<stephen.smalley.work@xxxxxxxxx> wrote:
> For those following along, I have pushed a commit that introduces a
> global SID table [1].
> It does not yet change any code to start using this global SID table,
> so that's next.
> Rather than introduce yet another data structure, I reused the
> existing SID table structures and code.
> For the global SID table, we only use the SID and the context str, len
> fields for all entries.
> If we later decide to optimize the global SID table more specifically,
> that can be done easily enough.

I finally got everything that requires global SIDs to use the new
global SID table, now available in my branch,
https://github.com/stephensmalley/selinux-kernel/tree/working-selinuxns

This proved to be more involved than I had anticipated; there are a
number of subtleties in our handling of contexts, and NetLabel was
caching SIDs in two locations, one of which is read outside the
security server (hence, required a global SID) and the other one is
only directly read/written inside the security server (hence, can be
mapped to/from global SIDs at the security server interface).

With these changes, I could drop the changes for revalidating and
updating inode, superblock, and open file SIDs per-namespace, so those
have been dropped from the branch although still available in a
working-selinuxns-beforeglobalsids branch for reference.

The SELinux testsuite passes, including NFS tests, in the initial
SELinux namespace, and everything except for the
socket/networking-related and mac_admin tests passes in a child
SELinux namespace.
The socket/networking-related test failures are unsurprising, partly
due to needing to also unshare the network namespace to avoid the
SELinux netlink notifications from confusing the initial namespace's
AVC (wrong/out-of-order policy seqno) and partly due to the fact that
we do not yet have a way to associate the SELinux namespace for use in
hooks that occur outside of process context. The mac_admin test
failures are likewise unsurprising (getting/setting unmapped context
values); I had similar issues with those tests even in the initial
namespace before making some further specialized changes; I will see
if I can get that to work properly in the child namespace too.

Going back to the list of known issues to resolve and omitting the
ones resolved by having global SIDs, we are left with the following:

1. Updating hook functions called outside of process context, e.g.
task_kill, send_sigiotask, network input/forward, to use the correct
SELinux namespace instead of using the current one; this requires
storing a pointer to the SELinux namespace in some relevant data
structure from which we can fetch it in those hook functions, e.g. the
file or socket security blobs.

2. Updating the SELinux hook functions to check permissions against
all ancestor namespaces rather than just the current one, and consider
introducing a top-level global AVC to avoid the need to check against
each per-namespace AVC on every check.

3. Providing a way to restrict or bound nesting of SELinux namespaces,
particularly given the resource usage associated with loading a policy
per-namespace and having a per-namespace AVC, sidtab, etc.

4. Hardening the policy loading code and other selinuxfs interfaces to
support potentially unprivileged usage by child namespaces.

5. Optimizing the namespace support, global SID table, etc. to avoid
imposing significant overheads especially for the case where there is
only a single namespace since that will likely remain the common case.

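To make item 2 above concrete, here is a minimal userspace sketch of
checking a permission against the current namespace and every ancestor.
It is not code from the branch: struct ns_sketch, ns_check_all(), the
allowed() callback, and the two sample policies are all hypothetical
names invented for illustration, and -1 merely stands in for an
-EACCES style denial. It assumes the SIDs passed in are global SIDs, so
the same values are meaningful in every namespace on the chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for per-namespace SELinux state; NOT the
 * kernel's structure. Only the parent link and a decision callback
 * are modeled here.
 */
struct ns_sketch {
	const struct ns_sketch *parent;	/* NULL for the initial namespace */
	/* Hypothetical per-namespace policy decision. */
	bool (*allowed)(uint32_t ssid, uint32_t tsid, uint32_t perm);
};

/* Sample policy: allows everything. */
bool sketch_allow_all(uint32_t ssid, uint32_t tsid, uint32_t perm)
{
	(void)ssid; (void)tsid; (void)perm;
	return true;
}

/* Sample policy: denies permission value 2, allows the rest. */
bool sketch_deny_perm_2(uint32_t ssid, uint32_t tsid, uint32_t perm)
{
	(void)ssid; (void)tsid;
	return perm != 2;
}

/*
 * Walk from ns up to the initial namespace; access is granted only
 * if every namespace on the chain allows it. Returns 0 on success,
 * -1 (stand-in for a denial error) as soon as one namespace denies.
 */
int ns_check_all(const struct ns_sketch *ns,
		 uint32_t ssid, uint32_t tsid, uint32_t perm)
{
	assert(ns != NULL);
	for (; ns != NULL; ns = ns->parent) {
		if (!ns->allowed(ssid, tsid, perm))
			return -1;
	}
	return 0;
}
```

With a child using sketch_allow_all whose parent uses
sketch_deny_perm_2, a request for permission 2 is denied even though
the child's own policy would allow it. The loop costs one decision per
nesting level on every check, which is the overhead a top-level global
AVC would be meant to avoid.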
Reminder on usage for anyone who wants to play with it:

# Unshare the SELinux namespace
echo 1 > /sys/fs/selinux/unshare
# Unshare the mount and network namespaces
# Required so that we can have our own private selinuxfs mount and
# so that we can have our own private NETLINK_SELINUX socket.
unshare -m -n
# Unmount the parent namespace's selinuxfs and mount our new one.
umount /sys/fs/selinux
mount -t selinuxfs none /sys/fs/selinux
# Load a policy into our namespace
load_policy
# Switch to a safe context
runcon unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 /bin/bash
# Go enforcing
setenforce 1