On Thu, Aug 1, 2024 at 10:03 AM Stephen Smalley <stephen.smalley.work@xxxxxxxxx> wrote: > > On Thu, Aug 1, 2024 at 9:01 AM Stephen Smalley > <stephen.smalley.work@xxxxxxxxx> wrote: > > > > On Thu, Aug 1, 2024 at 8:27 AM Stephen Smalley > > <stephen.smalley.work@xxxxxxxxx> wrote: > > > > > > Given the recent discussion of the SELinux namespaces patches, I > > > re-based the working-selinuxns branch of my selinux-kernel fork on top > > > of the dev branch. This required first reverting commit e67b79850fcc > > > ("selinux: stop passing selinux_state pointers and their offspring") > > > which I had created at Linus' request some time ago to avoid the > > > extraneous overhead associated with passing those pointers when there > > > could only be one selinux_state structure. Due to the number of > > > changes, both substantive and coding style related, since the last > > > re-base in 2020, there were numerous conflicts that required manual > > > resolution. I also checked the coding style of each patch with Paul's > > > scripts and fixed any issues introduced by the patches along the way. > > > > > > The rebase can be found at: > > > https://github.com/stephensmalley/selinux-kernel/tree/working-selinuxns > > > > > > It boots, passes the selinux-testsuite (including the NFS tests), and > > > passes the following > > > trivial exercising of the unshare mechanism: > > > $ sudo bash > > > # echo 1 > /sys/fs/selinux/unshare > > > # unshare -m -n > > > # umount /sys/fs/selinux > > > # mount -t selinuxfs none /sys/fs/selinux > > > # id > > > uid=0(root) gid=0(root) groups=0(root) context=kernel > > > # getenforce > > > Permissive > > > # load_policy > > > # id > > > uid=0(root) gid=0(root) groups=0(root) context=system_u:system_r:kernel_t:s0 > > > > > > All the same caveats apply - this is still not safe to use and has > > > many unresolved issues as noted in the patch descriptions. > > > > Also, note that as before, this branch does NOT include the original > > patches to support per-namespace superblock and inode security > > structures. Given the known problems with those patches and recent > > discussions, we likely don't want them anyway, but for reference they > > can still be found here: > > https://github.com/stephensmalley/selinux-kernel/commit/3378718ef7d4a837f32c63bdfcc0b70342cdd55d > > https://github.com/stephensmalley/selinux-kernel/commit/efb2ddadfdd0e10e75b6aa5da2ed9841df6ef2f6 > > > > Without those patches, ls -Z will show as unlabeled any files whose > > inodes were already in-core at the time of the creation of the new > > SELinux namespace because their inode security structures will have > > SIDs that do not exist in the new namespace's SID table. > > > > An alternative approach proposed by Huawei to my original patches can > > be found here: > > https://lore.kernel.org/selinux/20220216125206.20975-1-igor.baranov@xxxxxxxxxx/#r > > > > However, those patches also seem to have quite a few unresolved issues. > > Actually, re-reading that thread inspired me to take one of the ideas > suggested by Huawei, so I just pushed up one change on top of my > working-selinuxns branch to support saving the SELinux namespace in > the inode security blob and re-initializing it when an inode is > accessed by a process in a different SELinux namespace. This at least > allows ls -Z to correctly show the security contexts of files even > when they are already in-core. > There still remain many issues though to resolve... As just one example, if you setenforce 1 in the new SELinux namespace, you'll instantly lose access to your open files since the file security blobs (not the inode security blobs) still have SIDs that were created in the parent namespace and aren't defined in yours. And naively changing those to match your namespace will break access by the parent process. Probably need to create a helper program akin to newrole that handles closing and re-opening the fds in the new namespace. Then you still have the problem of the inode security blobs not being refreshed on certain code paths because you can't sleep on those code paths and fetching the xattr may sleep. Might require saving a copy of the security context string in the inode security blob so that you don't have to perform any blocking operations to refresh the SID to match the current namespace, but that will come at an obvious cost in memory.