On Wed 22-03-23 14:25:25, Florian Schmidt wrote: > cgroups v1 has a unique way of setting up memory pressure notifications: > the user opens "memory.pressure_level" of the cgroup they want to > monitor for pressure, then open "cgroup.event_control" and write the fd > (among other things) to that file. memory.pressure_level has no other > use, specifically it does not support any read or write operations. > Consequently, no handlers are provided, and the file ends up with > permissions 000. However, to actually use the mechanism, the subscribing > user must have read access to the file and open the fd for reading, see > memcg_write_event_control(). > > This is all fine as long as the subscribing process runs as root and is > otherwise unconfined by further restrictions. However, if you add strict > access controls such as selinux, the permission bits will be enforced, > and opening memory.pressure_level for reading will fail, preventing the > process from subscribing, even as root. > > > There are several ways around this issue, but adding a dummy read > handler seems like the least invasive to me. I was struggling to see how that addresses the problem because all you need is a read permission. But then I've looked into cgroup code and learned that permissions are constructed based on available callbacks (cgroup_file_mode). This would have made the review easier ;) I have no issue with the patch. It would be great to hear from cgroup maintainers whether a concept of default permissions is something that would be useful also for other files. > I'd be interested to hear: > (a) do you think there is a less invasive way? Alternatively, we could > add a flag in cftype in include/linux/cgroup-defs.h, but that seems > more invasive for what is a legacy interface. > (b) would you be interested to take this patch, or is it too niche a fix > for a legacy subsystem? After you add your s-o-b, feel free to add Acked-by: Michal Hocko <mhocko@xxxxxxxx> If cgroup people find a concept of default permissions for a cgroup file sound then this could be replaced by that approach but this is really an easy workaround. > --- > mm/memcontrol.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 5abffe6f8389..e48c749d9724 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3734,6 +3734,16 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css, > } > } > > +/* > + * This function doesn't do anything useful. Its only job is to provide a read > + * handler so that the file gets read permissions when it's created. I would just reference cgroup_file_mode() in the comment to make our lifes easier and comment more helpful. > + */ > +static int mem_cgroup_dummy_seq_show(__always_unused struct seq_file *m, > + __always_unused void *v) > +{ > + return -EINVAL; > +} > + > #ifdef CONFIG_MEMCG_KMEM > static int memcg_online_kmem(struct mem_cgroup *memcg) > { > @@ -5064,6 +5074,7 @@ static struct cftype mem_cgroup_legacy_files[] = { > }, > { > .name = "pressure_level", > + .seq_show = mem_cgroup_dummy_seq_show, > }, > #ifdef CONFIG_NUMA > { > -- > 2.32.0 -- Michal Hocko SUSE Labs