Re: SELinux namespaces re-base

Stephen Smalley <stephen.smalley.work@xxxxxxxxx> · Thu, 1 Aug 2024 15:28:40 -0400

On Thu, Aug 1, 2024 at 2:48 PM Stephen Smalley
<stephen.smalley.work@xxxxxxxxx> wrote:
>
> On Thu, Aug 1, 2024 at 10:03 AM Stephen Smalley
> <stephen.smalley.work@xxxxxxxxx> wrote:
> >
> > On Thu, Aug 1, 2024 at 9:01 AM Stephen Smalley
> > <stephen.smalley.work@xxxxxxxxx> wrote:
> > >
> > > On Thu, Aug 1, 2024 at 8:27 AM Stephen Smalley
> > > <stephen.smalley.work@xxxxxxxxx> wrote:
> > > >
> > > > Given the recent discussion of the SELinux namespaces patches, I
> > > > re-based the working-selinuxns branch of my selinux-kernel fork on top
> > > > of the dev branch. This required first reverting commit e67b79850fcc
> > > > ("selinux: stop passing selinux_state pointers and their offspring")
> > > > which I had created at Linus' request some time ago to avoid the
> > > > extraneous overhead associated with passing those pointers when there
> > > > could only be one selinux_state structure. Due to the number of
> > > > changes, both substantive and coding style related, since the last
> > > > re-base in 2020, there were numerous conflicts that required manual
> > > > resolution. I also checked the coding style of each patch with Paul's
> > > > scripts and fixed any issues introduced by the patches along the way.
> > > >
> > > > The rebase can be found at:
> > > > https://github.com/stephensmalley/selinux-kernel/tree/working-selinuxns
> > > >
> > > > It boots, passes the selinux-testsuite (including the NFS tests), and
> > > > passes the following
> > > > trivial exercising of the unshare mechanism:
> > > > $ sudo bash
> > > > # echo 1 > /sys/fs/selinux/unshare
> > > > # unshare -m -n
> > > > # umount /sys/fs/selinux
> > > > # mount -t selinuxfs none /sys/fs/selinux
> > > > # id
> > > > uid=0(root) gid=0(root) groups=0(root) context=kernel
> > > > # getenforce
> > > > Permissive
> > > > # load_policy
> > > > # id
> > > > uid=0(root) gid=0(root) groups=0(root) context=system_u:system_r:kernel_t:s0
> > > >
> > > > All the same caveats apply - this is still not safe to use and has
> > > > many unresolved issues as noted in the patch descriptions.
> > >
> > > Also, note that as before, this branch does NOT include the original
> > > patches to support per-namespace superblock and inode security
> > > structures. Given the known problems with those patches and recent
> > > discussions, we likely don't want them anyway, but for reference they
> > > can still be found here:
> > > https://github.com/stephensmalley/selinux-kernel/commit/3378718ef7d4a837f32c63bdfcc0b70342cdd55d
> > > https://github.com/stephensmalley/selinux-kernel/commit/efb2ddadfdd0e10e75b6aa5da2ed9841df6ef2f6
> > >
> > > Without those patches, ls -Z will show as unlabeled any files whose
> > > inodes were already in-core at the time of the creation of the new
> > > SELinux namespace because their inode security structures will have
> > > SIDs that do not exist in the new namespace's SID table.
> > >
> > > An alternative approach proposed by Huawei to my original patches can
> > > be found here:
> > > https://lore.kernel.org/selinux/20220216125206.20975-1-igor.baranov@xxxxxxxxxx/#r
> > >
> > > However, those patches also seem to have quite a few unresolved issues.
> >
> > Actually, re-reading that thread inspired me to take one of the ideas
> > suggested by Huawei, so I just pushed up one change on top of my
> > working-selinuxns branch to support saving the SELinux namespace in
> > the inode security blob and re-initializing it when an inode is
> > accessed by a process in a different SELinux namespace. This at least
> > allows ls -Z to correctly show the security contexts of files even
> > when they are already in-core.
> > There still remain many issues though to resolve...
>
> As just one example, if you setenforce 1 in the new SELinux namespace,
> you'll instantly lose access to your open files since the file
> security blobs (not the inode security blobs) still have SIDs that
> were created in the parent namespace and aren't defined in yours. And
> naively changing those to match your namespace will break access by
> the parent process. Probably need to create a helper program akin to
> newrole that handles closing and re-opening the fds in the new
> namespace.
>
> Then you still have the problem of the inode security blobs not being
> refreshed on certain code paths because you can't sleep on those code
> paths and fetching the xattr may sleep. Might require saving a copy of
> the security context string in the inode security blob so that you
> don't have to perform any blocking operations to refresh the SID to
> match the current namespace, but that will come at an obvious cost in
> memory.

The last paragraph above actually rears its head even without fixing
the fd problem;
unfortunately that last patch I pushed on top of working-selinuxns
will render you unable to ssh into the system if/when you end up
creating a new SELinux namespace and accessing files needed by sshd.
It seems that the patch is updating the inode SID and namespace to the
new namespace and then sshd is trying to access those files through a
code path that cannot call  inode_security_revalidate() with
may_sleep==true and hence ends up using the wrong SID and being denied
access.

Smack's equivalent to the SID table allowed one to safely pass around
pointers to the label string, which was never de-allocated, so it only
needed a single copy in memory of each label. We could add something
similar outside the security server in SELinux so that
inode_security_revalidate() could re-map the context string to an
appropriate SID without requiring sleeping.