Re: cgroup2 labeling question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ondrej Mosnacek <omosnace@xxxxxxxxxx> writes:

> On Mon, Mar 20, 2023 at 2:59 PM Dominick Grift
> <dominick.grift@xxxxxxxxxxx> wrote:
>>
>> Stephen Smalley <stephen.smalley.work@xxxxxxxxx> writes:
>>
>> > On Mon, Mar 20, 2023 at 3:25 AM Dominick Grift
>> > <dominick.grift@xxxxxxxxxxx> wrote:
>> >>
>> >>
>> >> Hi,
>> >>
>> >> I was reading this pull request [1] and looked into how I might be able
>> >> to implement this in policy but there seem to be some technical
>> >> difficulties.
>> >>
>> >> * I already use getfscon to seperate the systemd user.slice because the
>> >>   system manager delegates the user.slice to the user manager.
>> >>
>> >>   (genfscon "cgroup2" "/user.slice" cgroupfile_context)
>> >>
>> >>   In the past the proved to be a racy where systemd attempts to
>> >>   write before the object has the context associated with the genfscon.
>> >
>> > I don't understand how this could be racy - genfscon-assigned contexts
>> > should be assigned when the dentry is first instantiated via
>> > inode_donit_with_dentry and therefore the inode shouldn't be
>> > accessible to userspace prior to this initial assignment AFAIK.
>> > Possibly I am missing something.
>>
>> I recall encountering this sporadically, but I admit that it has been a
>> while since I supressed it in policy. I might try to reproduce. AFAIK my
>> policy is the only policy that actually labels some trees on cgroup2 fs
>> with private types currently.
>>
>> >
>> >>   I decided to dontaudit attempts to write to the mislabeled object and
>> >>   it *seems* as if systemd retries until it can write it i.e. when the
>> >>   object carries the expected label and so that seems to work eventually
>> >>   but it looks fragile.
>> >>
>> >> * The challenge with memory pressure implementation [2] is that these
>> >>   "memory.pressure" files end up in random locations under
>> >>   "/system.slice" for example:
>> >>
>> >>   /sys/fs/cgroup/system.slice/systemd-journald.service/memory.pressure
>> >>
>> >>   Where in the above systemd-journald.service might be
>> >>   templated (systemd-journald@FOO.service). Point is that the path is
>> >>   random. genfscon does not support regex and glob. I can't do for example:
>> >>
>> >>   (genfscon "cgroup2" "/system.slice/.*/memory.pressure"
>> >>   cgroupfile_context)
>> >>
>> >>   Fortunately cgroup2fs supports relabeling but if systemd has to
>> >>   manually relabel the cgroup files then I would imagine that this is
>> >>   racy as well, and that does not really solve the underlying issue.
>> >>
>> >>   I am looking for ideas and suggestions
>> >
>> > Optimally one of two things would happen:
>> > 1. The kernel would label the inode correctly when it is first created
>> > (e.g. by augmenting genfscon to support more general matching), or
>> > 2. The userspace component that creates these files would label them
>> > correctly at creation (via setfscreatecon() prior to creation).
>>
>> Agree but 1. would require regex/glob support for genfscon and 2. these
>> files aren't "created" by userspace AFAIK and so setfscreatecon or
>> automatic object type transitions are probably not an option here.
>>
>> >
>> > Pardon my ignorance but what creates these files initially? The kernel
>> > in response to some event or systemd or some other userspace
>> > component?
>>
>> Yes AFAIK it is the former (psuedo filesystem similar to procfs, debugfs
>> in that sense). This is also why I don't think that the PR mentioned is
>> tested because cgroup2 fs labeling is done with genfscon and not fsuse
>> trans or fsuse xattr so even if the files would be created by
>> userspace (which I think is not the case) the specified automatic object
>> type transition rule wouldnt work.
>
> Actually, type transitions on cgroupfs should work - I added special
> hooks for kernfs just for that some time ago - see kernel commits
> d0c9c153b4bd6963c8fcccbc0caa12e8fa8d971d..e19dfdc83b60f196e0653d683499f7bc5548128f.

Interesting. I will try this out. Would this not require at least a
"fsuse trans" statement in policy?

https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/filesystem.te#L89

Also I am not sure if that support would make much sense on a filesystem
where files are created my the kernel in reaction to some event.

>
> Not sure what's behind the genfscon label assignment race, though.
>
>>
>> I think eventually we currently probably have little choice but to make systemd
>> reset the context of said cgroup file manually. Just wanted to see if
>> there are alternatives.
>>
>> >
>> >> [1] https://github.com/SELinuxProject/refpolicy/pull/607
>> >> [2] https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md
>>
>> --
>> gpg --locate-keys dominick.grift@xxxxxxxxxxx
>> Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
>> Dominick Grift
>>

-- 
gpg --locate-keys dominick.grift@xxxxxxxxxxx
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
Dominick Grift




[Index of Archives]     [Selinux Refpolicy]     [Linux SGX]     [Fedora Users]     [Fedora Desktop]     [Yosemite Photos]     [Yosemite Camping]     [Yosemite Campsites]     [KDE Users]     [Gnome Users]

  Powered by Linux