On Mon, Mar 20, 2023 at 10:46 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote: > > On Mon, Mar 20, 2023 at 3:19 PM Dominick Grift > <dominick.grift@xxxxxxxxxxx> wrote: > > > > Ondrej Mosnacek <omosnace@xxxxxxxxxx> writes: > > > > > On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift > > > <dominick.grift@xxxxxxxxxxx> wrote: > > >> > > >> Stephen Smalley <stephen.smalley.work@xxxxxxxxx> writes: > > >> > > >> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift > > >> > <dominick.grift@xxxxxxxxxxx> wrote: > > >> >> > > >> >> > > >> >> Hi, > > >> >> > > >> >> I was reading this pull request [1] and looked into how I might be able > > >> >> to implement this in policy but there seem to be some technical > > >> >> difficulties. > > >> >> > > >> >> * I already use getfscon to seperate the systemd user.slice because the > > >> >> system manager delegates the user.slice to the user manager. > > >> >> > > >> >> (genfscon "cgroup2" "/user.slice" cgroupfile_context) > > >> >> > > >> >> In the past the proved to be a racy where systemd attempts to > > >> >> write before the object has the context associated with the genfscon. > > >> > > > >> > I don't understand how this could be racy - genfscon-assigned contexts > > >> > should be assigned when the dentry is first instantiated via > > >> > inode_donit_with_dentry and therefore the inode shouldn't be > > >> > accessible to userspace prior to this initial assignment AFAIK. > > >> > Possibly I am missing something. > > >> > > >> I recall encountering this sporadically, but I admit that it has been a > > >> while since I supressed it in policy. I might try to reproduce. AFAIK my > > >> policy is the only policy that actually labels some trees on cgroup2 fs > > >> with private types currently. > > >> > > >> > > > >> >> I decided to dontaudit attempts to write to the mislabeled object and > > >> >> it *seems* as if systemd retries until it can write it i.e. when the > > >> >> object carries the expected label and so that seems to work eventually > > >> >> but it looks fragile. > > >> >> > > >> >> * The challenge with memory pressure implementation [2] is that these > > >> >> "memory.pressure" files end up in random locations under > > >> >> "/system.slice" for example: > > >> >> > > >> >> /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure > > >> >> > > >> >> Where in the above systemd-journald.service might be > > >> >> templated (systemd-journald@FOO.service). Point is that the path is > > >> >> random. genfscon does not support regex and glob. I can't do for example: > > >> >> > > >> >> (genfscon "cgroup2" "/system.slice/.*/memory.pressure" > > >> >> cgroupfile_context) > > >> >> > > >> >> Fortunately cgroup2fs supports relabeling but if systemd has to > > >> >> manually relabel the cgroup files then I would imagine that this is > > >> >> racy as well, and that does not really solve the underlying issue. > > >> >> > > >> >> I am looking for ideas and suggestions > > >> > > > >> > Optimally one of two things would happen: > > >> > 1. The kernel would label the inode correctly when it is first created > > >> > (e.g. by augmenting genfscon to support more general matching), or > > >> > 2. The userspace component that creates these files would label them > > >> > correctly at creation (via setfscreatecon() prior to creation). > > >> > > >> Agree but 1. would require regex/glob support for genfscon and 2. these > > >> files aren't "created" by userspace AFAIK and so setfscreatecon or > > >> automatic object type transitions are probably not an option here. > > >> > > >> > > > >> > Pardon my ignorance but what creates these files initially? The kernel > > >> > in response to some event or systemd or some other userspace > > >> > component? > > >> > > >> Yes AFAIK it is the former (psuedo filesystem similar to procfs, debugfs > > >> in that sense). This is also why I don't think that the PR mentioned is > > >> tested because cgroup2 fs labeling is done with genfscon and not fsuse > > >> trans or fsuse xattr so even if the files would be created by > > >> userspace (which I think is not the case) the specified automatic object > > >> type transition rule wouldnt work. > > > > > > Actually, type transitions on cgroupfs should work - I added special > > > hooks for kernfs just for that some time ago - see kernel commits > > > d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f. > > > > Interesting. I will try this out. Would this not require at least a > > "fsuse trans" statement in policy? > > No, it should work alongside genfscon. cgroupfs already was special > before that as it allowed relabeling despite genfscon being used. > > > > > https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89 > > > > Also I am not sure if that support would make much sense on a filesystem > > where files are created my the kernel in reaction to some event. > > It does make sense with named transitions, plus it was needed to make > even a simple parent-child inheritance work. Also, I believe some > cgroupfs files/directories (I think only directories?) can be created > by userspace, too. We should likely check that the SELinux Notebook and/or other documentation reflects this support and which filesystem types are supported, both wrt the filesystem types that support both genfscon + setxattr and those that support genfscon+setxattr+type_transition rules.