On 2024/5/25 00:03, Michal Koutný wrote: > On Fri, May 17, 2024 at 03:14:23PM GMT, Sam Sun <samsun1006219@xxxxxxxxx> wrote: >> ... >> We analyzed the root cause of this problem. It happens when >> concurrently accessing >> "/sys/fs/cgroup/sys-fs-fuse-connections.mount/irq.pressure" and >> "/sys/fs/cgroup/sys-fs-fuse-connections.mount/cgroup.pressure". If we >> echo 0 to cgroup.pressure, kernel will invoke cgroup_pressure_write(), >> and call kernfs_show(). It will set kn->flags to KERNFS_HIDDEN and >> call kernfs_drain(), in which it frees kernfs_open_file *of. On the >> other side, when accessing irq.pressure, kernel calls >> pressure_write(), which will access of->priv. So that it triggers a >> use-after-free. > > Thanks for the nice breakdown. > > What would you tell to something like below (not tested). Thanks for the detailed report analysis and this fix patch. I can still reproduce the UAF problem with this patch by running: terminal 1: while true; do echo "some 150000 1000000" > cpu.pressure; done terminal 2: while true; do echo 1 > cgroup.pressure; echo 0 > cgroup.pressure; done Because we still access kernfs_open_file in pressure_write() before cgroup_mutex taken. It seems like a problem with kernfs_drain()? I think it should make sure no active users of kernfs_open_file when it returns, right? Will take a look again. Thanks. > > Regards, > Michal > > -- >8 -- > From f159b20051a921bcf990a4488ca6d49382b61a01 Mon Sep 17 00:00:00 2001 > From: =?UTF-8?q?Michal=20Koutn=C3=BD?= <mkoutny@xxxxxxxx> > Date: Fri, 24 May 2024 16:50:24 +0200 > Subject: [PATCH] cgroup: Pin appropriate resources when creating PSI pressure > trigger > MIME-Version: 1.0 > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit > > Wrongly synchronized access to kernfs_open_file was detected by > syzkaller when there is a race between trigger creation and disabling of > pressure measurements for a cgroup (echo 0 >cgroup.pressure). > > Use cgroup_mutex to synchronize whole duration of pressure_write() to > prevent working with a free'd kernfs_open_file by excluding concurrent > cgroup_pressure_write() (uses cgroup_mutex already). > > Fixes: 0e94682b73bf ("psi: introduce psi monitor") > Fixes: 34f26a15611a ("sched/psi: Per-cgroup PSI accounting disable/re-enable interface") > Reported-by: Yue Sun <samsun1006219@xxxxxxxxx> > Reported-by: xingwei lee <xrivendell7@xxxxxxxxx> > Signed-off-by: Michal Koutný <mkoutny@xxxxxxxx> > --- > kernel/cgroup/cgroup.c | 17 ++++++++--------- > 1 file changed, 8 insertions(+), 9 deletions(-) > > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c > index e32b6972c478..e16ebd0c4977 100644 > --- a/kernel/cgroup/cgroup.c > +++ b/kernel/cgroup/cgroup.c > @@ -3777,31 +3777,30 @@ static ssize_t pressure_write(struct kernfs_open_file *of, char *buf, > struct psi_trigger *new; > struct cgroup *cgrp; > struct psi_group *psi; > + ssize_t ret = nbytes; > > cgrp = cgroup_kn_lock_live(of->kn, false); > if (!cgrp) > return -ENODEV; > > - cgroup_get(cgrp); > - cgroup_kn_unlock(of->kn); > - > /* Allow only one trigger per file descriptor */ > if (ctx->psi.trigger) { > - cgroup_put(cgrp); > - return -EBUSY; > + ret = -EBUSY; > + goto out; > } > > psi = cgroup_psi(cgrp); > new = psi_trigger_create(psi, buf, res, of->file, of); > if (IS_ERR(new)) { > - cgroup_put(cgrp); > - return PTR_ERR(new); > + ret = PTR_ERR(new); > + goto out; > } > > smp_store_release(&ctx->psi.trigger, new); > - cgroup_put(cgrp); > > - return nbytes; > +out: > + cgroup_kn_unlock(of->kn); > + return ret; > } > > static ssize_t cgroup_io_pressure_write(struct kernfs_open_file *of,