On Mon, Apr 27, 2020 at 4:40 PM Daniel Burgener <dburgener@xxxxxxxxxxxxxxxxxxx> wrote: > > Hello all, > > We've noticed a few instances of strange failures in userspace object > managers happening immediately after a policy load, that we believe > we've traced to a race condition in the kernel, and wanted to get your > thoughts on our diagnosis and the appropriate fix before creating a > patch in case we've missed something. > > The issue is that userspace object managers rely on /sys/fs/selinux to > determine the mapping of object class and permission strings to numeric > identifiers, however immediately after a policy load, > /sys/fs/selinux/{booleans,class,policy_capabilities}, are deleted and > recreated. This leaves a window where the object classes (etc) appear > to be not defined, even if they are defined in both the old and new > policies. > > We have observed this with both dbus and systemd userspace object > managers, and it can be reproduced straightforwardly by running the > following (borrowed from bug linked below): > > # (while true; do cat /sys/fs/selinux/class/service/index >/dev/null; > done) & > # while true; do sudo load_policy; echo -n .; sleep 0.1;done > > Periodically, we will get "No such file or directory" messages printed > to stderr. In the event of a userspace object manager using libselinux > to check a userspace permission, that will result in a USER_AVC message > indicating an unknown object class, and in the event that > --handle-unknown is set to "deny", it will deny the access. > > It seems to me as though some sort of locking should occur in the > selinuxfs to prevent access to the files being deleted and recreated > during the policy load, so that userspace programs relying on them (in > particular userspace object managers doing class lookups) get a > consistent view of the classes, perms, booleans and capabilities in the > loaded policy. > > This seems to be related to > https://github.com/SELinuxProject/selinux-kernel/issues/42 but I believe > it is a different case. The deadlock in that bug seems to be related to > the underlying filesystem functions, specifically around directory > deletion while this is an issue the selinuxfs logic specifically. The > above linked issue appears to have been fixed in recent upstream > kernels, per the bug, but I have verified the issue I am discussing here > in 5.7.0 rc3. > > It seems to me as though from the perspective of userspace that all of > sel_make_policy_nodes (or at least all of each of its component > functions) should be atomic. There was some discussion in a previous > thread > (https://lore.kernel.org/selinux/20181002155810.GP32577@xxxxxxxxxxxxxxxxxx/) > around a significant refactor of policy loading in general. It appears > as though the direct issue there of access during the deletion has been > resolved (commit d4f4de5e5ef8efde85febb6876cd3c8ab1631999), although the > complete suggested fix of refactoring policy_load into two parts has not > been done. Would that refactor be the right approach to the problem I > am trying to solve? Would a patch for adding locking around the > selinuxfs delete/recreate operation be considered? That wouldn't address > all the concerns, (namely the potential to access a view of the policy > that doesn't match the currently loaded policy and error recovery in the > case that sel_make_nodes fails), but would improve the reliability of > existing userspace object managers > > I'm happy to create and submit a patch, but I wanted to get the > communities thought on the problem and correct approach first. I think the best approach is the one sketched by viro but I recognize that's a big lift. Willing to help but not entirely clear on the interactions with the dcache for atomically swapping the new trees into place. I suspect this is just being exposed now due to more recent changes in userspace to try to fully support changes in class/permission values at runtime. Previously userspace object managers tended to only map them during initialization or on first use and then would just keep using the cached values, whereas now they try to map them lazily and flush their caches on a reload notification. Some of those changes were in libselinux and others in the userspace object managers (e.g. systemd, dbus-broker or dbusd). Not sure exactly what versions of libselinux, systemd, and dbusd/dbus-broker you are using. selinuxfs itself does take a mutex around the entire policy load including the delete/re-create sequence but that only serializes with other policy operations (load, read, booleans), not with dcache walk/lookup. cc'ing Ondrej since he attempted to fix the earlier selinuxfs bug and may have an opinion on the best way forward for this issue.