On Mon, Oct 16, 2023 at 6:08 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > [ > That thing sits in viro/vfs.git#work.selinuxfs; I have > lock_rename()-related followups in another branch, so a pull would be more > convenient for me than cherry-pick. NOTE: testing and comments would > be very welcome - as it is, the patch is pretty much untested beyond > "it builds". > ] > > On policy reload selinuxfs replaces two subdirectories (/booleans > and /class) with new variants. Unfortunately, that's done with > serious abuses of directory locking. > > 1) lock_rename() should be done to parents, not to objects being > exchanged > > 2) there's a bunch of reasons why it should not be done for directories > that do not have a common ancestor; most of those do not apply to > selinuxfs, but even in the best case the proof is subtle and brittle. > > 3) failure halfway through the creation of /class will leak > names and values arrays. > > 4) use of d_genocide() is also rather brittle; it's probably not much of > a bug per se, but e.g. an overmount of /sys/fs/selinuxfs/classes/shm/index > with any regular file will end up with leaked mount on policy reload. > Sure, don't do it, but... > > Let's stop messing with disconnected directories; just create > a temporary (/.swapover) with no permissions for anyone (on the > level of ->permission() returing -EPERM, no matter who's calling > it) and build the new /booleans and /class in there; then > lock_rename on root and that temporary directory and d_exchange() > old and new both for class and booleans. Then unlock and use > simple_recursive_removal() to take the temporary out; it's much > more robust. > > And instead of bothering with separate pathways for freeing > new (on failure halfway through) and old (on success) names/values, > do all freeing in one place. With temporaries swapped with the > old ones when we are past all possible failures. > > The only user-visible difference is that /.swapover shows up > (but isn't possible to open, look up into, etc.) for the > duration of policy reload. Thanks Al. Giving this a very quick look, I like the code simplifications that come out of this change and I'll trust you on the idea that this approach is better from a VFS perspective. While the reject_all() permission hammer is good, I do want to make sure we are covered from a file labeling perspective; even though the DAC/reject_all() check hits first and avoids the LSM inode permission hook, we still want to make sure the files are labeled properly. It looks like given the current SELinux Reference Policy this shouldn't be a problem, it will be labeled like most everything else in selinuxfs via genfscon (SELinux policy construct). I expect those with custom SELinux policies will have something similar in place with a sane default that would cover the /sys/fs/selinux/.swapover directory but I did add the selinux-refpol list to the CC line just in case I'm being dumb and forgetting something important with respect to policy. The next step is to actually boot up a kernel with this patch and make sure it doesn't break anything. Simply booting up a SELinux system and running 'load_policy' a handful of times should exercise the policy (re)load path, and if you want a (relatively) simple SELinux test suite you can find one here: * https://github.com/SELinuxProject/selinux-testsuite The README.md should have the instructions necessary to get it running. If you can't do that, and no one else on the mailing list is able to test this out, I'll give it a go but expect it to take a while as I'm currently swamped with reviews and other stuff. -- paul-moore.com