On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift <dominick.grift@xxxxxxxxxxx> wrote: > > Stephen Smalley <stephen.smalley.work@xxxxxxxxx> writes: > > > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift > > <dominick.grift@xxxxxxxxxxx> wrote: > >> > >> > >> Hi, > >> > >> I was reading this pull request [1] and looked into how I might be able > >> to implement this in policy but there seem to be some technical > >> difficulties. > >> > >> * I already use getfscon to seperate the systemd user.slice because the > >> system manager delegates the user.slice to the user manager. > >> > >> (genfscon "cgroup2" "/user.slice" cgroupfile_context) > >> > >> In the past the proved to be a racy where systemd attempts to > >> write before the object has the context associated with the genfscon. > > > > I don't understand how this could be racy - genfscon-assigned contexts > > should be assigned when the dentry is first instantiated via > > inode_donit_with_dentry and therefore the inode shouldn't be > > accessible to userspace prior to this initial assignment AFAIK. > > Possibly I am missing something. > > I recall encountering this sporadically, but I admit that it has been a > while since I supressed it in policy. I might try to reproduce. AFAIK my > policy is the only policy that actually labels some trees on cgroup2 fs > with private types currently. > > > > >> I decided to dontaudit attempts to write to the mislabeled object and > >> it *seems* as if systemd retries until it can write it i.e. when the > >> object carries the expected label and so that seems to work eventually > >> but it looks fragile. > >> > >> * The challenge with memory pressure implementation [2] is that these > >> "memory.pressure" files end up in random locations under > >> "/system.slice" for example: > >> > >> /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure > >> > >> Where in the above systemd-journald.service might be > >> templated (systemd-journald@FOO.service). Point is that the path is > >> random. genfscon does not support regex and glob. I can't do for example: > >> > >> (genfscon "cgroup2" "/system.slice/.*/memory.pressure" > >> cgroupfile_context) > >> > >> Fortunately cgroup2fs supports relabeling but if systemd has to > >> manually relabel the cgroup files then I would imagine that this is > >> racy as well, and that does not really solve the underlying issue. > >> > >> I am looking for ideas and suggestions > > > > Optimally one of two things would happen: > > 1. The kernel would label the inode correctly when it is first created > > (e.g. by augmenting genfscon to support more general matching), or > > 2. The userspace component that creates these files would label them > > correctly at creation (via setfscreatecon() prior to creation). > > Agree but 1. would require regex/glob support for genfscon and 2. these > files aren't "created" by userspace AFAIK and so setfscreatecon or > automatic object type transitions are probably not an option here. > > > > > Pardon my ignorance but what creates these files initially? The kernel > > in response to some event or systemd or some other userspace > > component? > > Yes AFAIK it is the former (psuedo filesystem similar to procfs, debugfs > in that sense). This is also why I don't think that the PR mentioned is > tested because cgroup2 fs labeling is done with genfscon and not fsuse > trans or fsuse xattr so even if the files would be created by > userspace (which I think is not the case) the specified automatic object > type transition rule wouldnt work. Actually, type transitions on cgroupfs should work - I added special hooks for kernfs just for that some time ago - see kernel commits d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f. Not sure what's behind the genfscon label assignment race, though. > > I think eventually we currently probably have little choice but to make systemd > reset the context of said cgroup file manually. Just wanted to see if > there are alternatives. > > > > >> [1] https://github.com/SELinuxProject/refpolicy/pull/607 > >> [2] https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md > > -- > gpg --locate-keys dominick.grift@xxxxxxxxxxx > Key fingerprint = FCD2 3660 5D6B 9D27 7FC6 E0FF DA7E 521F 10F6 4098 > Dominick Grift > -- Ondrej Mosnacek Senior Software Engineer, Linux Security - SELinux kernel Red Hat, Inc.