On Thu, Aug 29, 2024 at 10:04:53PM -0700, Shivani Agarwal wrote:
> From: Chen Ridong <chenridong@xxxxxxxxxx>
>
> [ Upstream commit 1be59c97c83ccd67a519d8a49486b3a8a73ca28a ]
>
> A UAF can happen when /proc/cpuset is read as reported in [1].
>
> This can be reproduced by the following methods:
> 1. Add an mdelay(1000) before acquiring the cgroup_lock in the
>    cgroup_path_ns function.
> 2. $cat /proc/<pid>/cpuset repeatedly.
> 3. $mount -t cgroup -o cpuset cpuset /sys/fs/cgroup/cpuset/
>    $umount /sys/fs/cgroup/cpuset/ repeatedly.
>
> The race that causes this bug can be shown as below:
>
> (umount)             |  (cat /proc/<pid>/cpuset)
> css_release          |  proc_cpuset_show
> css_release_work_fn  |  css = task_get_css(tsk, cpuset_cgrp_id);
> css_free_rwork_fn    |  cgroup_path_ns(css->cgroup, ...);
> cgroup_destroy_root  |  mutex_lock(&cgroup_mutex);
> rebind_subsystems    |
> cgroup_free_root     |
>                      |  // cgrp was freed, UAF
>                      |  cgroup_path_ns_locked(cgrp,..);
>
> When the cpuset is initialized, the root node top_cpuset.css.cgrp
> will point to &cgrp_dfl_root.cgrp. In cgroup v1, the mount operation will
> allocate cgroup_root, and top_cpuset.css.cgrp will point to the allocated
> &cgroup_root.cgrp. When the umount operation is executed,
> top_cpuset.css.cgrp will be rebound to &cgrp_dfl_root.cgrp.
>
> The problem is that when rebinding to cgrp_dfl_root, there are cases
> where the cgroup_root allocated by setting up the root for cgroup v1
> is cached. This could lead to a Use-After-Free (UAF) if it is
> subsequently freed. The descendant cgroups of cgroup v1 can only be
> freed after the css is released. However, the css of the root will never
> be released, yet the cgroup_root should be freed when it is unmounted.
> This means that obtaining a reference to the css of the root does
> not guarantee that css.cgrp->root will not be freed.
>
> Fix this problem by using rcu_read_lock in proc_cpuset_show().
> As cgroup_root is kfree_rcu after commit d23b5c577715
> ("cgroup: Make operations on the cgroup root_list RCU safe"),
> css->cgroup won't be freed during the critical section.
> To call cgroup_path_ns_locked, css_set_lock is needed, so it is safe to
> replace task_get_css with task_css.
>
> [1] https://syzkaller.appspot.com/bug?extid=9b1ff7be974a403aa4cd
>
> Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
> Signed-off-by: Chen Ridong <chenridong@xxxxxxxxxx>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> Signed-off-by: Shivani Agarwal <shivani.agarwal@xxxxxxxxxxxx>
> ---
>  kernel/cgroup/cpuset.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)

Now queued up, thanks.

greg k-h
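The lifetime argument in the quoted commit message (the reader takes no reference, and the old root is only freed once the read-side section has ended) can be illustrated with a toy, single-threaded userspace model. This is only a sketch under loud assumptions: every `toy_*` name, `dfl_root`, `top_css`, and the counters are hypothetical stand-ins, a plain reader counter replaces real RCU, and none of it is the kernel code or the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct cgroup_root (not the kernel type). */
struct toy_root {
    char path[32];
    struct toy_root *next;      /* links roots waiting to be freed */
};

/* Hypothetical stand-in for the root css, which is never released. */
struct toy_css {
    struct toy_root *root;
};

static struct toy_root dfl_root = { "/", NULL };   /* models cgrp_dfl_root */
static struct toy_css top_css = { &dfl_root };     /* models top_cpuset.css */

static int readers;                 /* readers inside the read-side section */
static struct toy_root *pending;    /* roots whose free has been deferred */
static int frees_done;              /* number of roots actually freed */

/* Free every deferred root; only called when no reader is active. */
static void toy_run_deferred_frees(void)
{
    while (pending) {
        struct toy_root *r = pending;
        pending = r->next;
        free(r);
        frees_done++;
    }
}

static void toy_read_lock(void) { readers++; }

static void toy_read_unlock(void)
{
    assert(readers > 0);
    if (--readers == 0)
        toy_run_deferred_frees();
}

/* Models a cgroup v1 mount: allocate a root and point the css at it. */
static int toy_mount_v1(void)
{
    struct toy_root *r = malloc(sizeof(*r));
    if (!r)
        return -1;
    strcpy(r->path, "/v1");
    r->next = NULL;
    top_css.root = r;
    return 0;
}

/* Models umount: rebind to the default root, then defer freeing the old one. */
static void toy_umount_v1(void)
{
    struct toy_root *old = top_css.root;
    if (old == &dfl_root)
        return;
    top_css.root = &dfl_root;
    old->next = pending;
    pending = old;
    if (readers == 0)
        toy_run_deferred_frees();
}

/* Reader: copy the current root's path while inside the read-side section.
 * racing_umount injects the umount at the point where the race would hit. */
static void toy_show_path(char *buf, size_t len, bool racing_umount)
{
    toy_read_lock();
    struct toy_root *r = top_css.root;  /* no reference taken, like task_css() */
    if (racing_umount)
        toy_umount_v1();                /* root retired while r is still in use */
    strncpy(buf, r->path, len - 1);     /* safe: the free is deferred */
    buf[len - 1] = '\0';
    toy_read_unlock();                  /* deferred frees may run from here on */
}
```

Calling `toy_mount_v1()` followed by `toy_show_path(buf, sizeof(buf), true)` copies the old root's path even though the umount ran in the middle of the read, because the free only happens once `toy_read_unlock()` drops the last reader. Without the read-side section, `toy_umount_v1()` would free the root immediately and the copy would touch freed memory, which is the shape of the race in the diagram.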