On Mon, Apr 15, 2024 at 5:47 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > On Mon, Apr 15, 2024 at 5:03 PM Jan Kara <jack@xxxxxxx> wrote: > > > > On Sat 13-04-24 12:32:32, Amir Goldstein wrote: > > > On Sat, Apr 13, 2024 at 11:45 AM Hillf Danton <hdanton@xxxxxxxx> wrote: > > > > On Fri, 12 Apr 2024 23:42:19 -0700 Amir Goldstein > > > > > On Sat, Apr 13, 2024 at 4:41=E2=80=AFAM Hillf Danton <hdanton@xxxxxxxx> wrote: > > > > > > On Thu, 11 Apr 2024 01:11:20 -0700 > > > > > > > syzbot found the following issue on: > > > > > > > > > > > > > > HEAD commit: 6ebf211bb11d Add linux-next specific files for 20240410 > > > > > > > git tree: linux-next > > > > > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=3D1621af9d180000 > > > > > > > > > > > > #syz test https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git 6ebf211bb11d > > > > > > > > > > > > --- x/fs/notify/fsnotify.c > > > > > > +++ y/fs/notify/fsnotify.c > > > > > > @@ -101,8 +101,8 @@ void fsnotify_sb_delete(struct super_blo > > > > > > wait_var_event(fsnotify_sb_watched_objects(sb), > > > > > > !atomic_long_read(fsnotify_sb_watched_objects(sb))); > > > > > > WARN_ON(fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_CONTENT)); > > > > > > - WARN_ON(fsnotify_sb_has_priority_watchers(sb, > > > > > > - FSNOTIFY_PRIO_PRE_CONTENT)); > > > > > > + WARN_ON(fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_PRE_CONTENT)); > > > > > > + synchronize_srcu(&fsnotify_mark_srcu); > > > > > > kfree(sbinfo); > > > > > > } > > > > > > > > > > > > @@ -499,7 +499,7 @@ int fsnotify(__u32 mask, const void *dat > > > > > > { > > > > > > const struct path *path =3D fsnotify_data_path(data, data_type); > > > > > > struct super_block *sb =3D fsnotify_data_sb(data, data_type); > > > > > > - struct fsnotify_sb_info *sbinfo =3D fsnotify_sb_info(sb); > > > > > > + struct fsnotify_sb_info *sbinfo; > > > > > > struct fsnotify_iter_info iter_info = {}; > > > > > > struct mount *mnt =3D NULL; > > > > > > struct inode *inode2 =3D NULL; > > > > > > @@ -529,6 +529,8 @@ int fsnotify(__u32 mask, const void *dat > > > > > > inode2_type =3D FSNOTIFY_ITER_TYPE_PARENT; > > > > > > } > > > > > > > > > > > > + iter_info.srcu_idx =3D srcu_read_lock(&fsnotify_mark_srcu); > > > > > > + sbinfo =3D fsnotify_sb_info(sb); > > > > > > /* > > > > > > * Optimization: srcu_read_lock() has a memory barrier which can > > > > > > * be expensive. It protects walking the *_fsnotify_marks lists. > > > > > > > > > > > > > > > See comment above. This kills the optimization. > > > > > It is not worth letting all the fsnotify hooks suffer the consequence > > > > > for the edge case of calling fsnotify hook during fs shutdown. > > > > > > > > Say nothing before reading your fix. > > > > > > > > > > Also, fsnotify_sb_info(sb) in fsnotify_sb_has_priority_watchers() > > > > > is also not protected and using srcu_read_lock() there completely > > > > > nullifies the purpose of fsnotify_sb_info. > > > > > > > > > > Here is a simplified fix for fsnotify_sb_error() rebased on the > > > > > pending mm fixes for this syzbot boot failure: > > > > > > > > > > #syz test: https://github.com/amir73il/linux fsnotify-fixes > > > > > > > > Feel free to post your patch at lore because not everyone has > > > > access to sites like github. > > > > > > > > > > Jan, > > > > > > > > > > I think that all the functions called from fs shutdown context > > > > > should observe that SB_ACTIVE is cleared but wasn't sure? > > > > > > > > If you composed fix based on SB_ACTIVE that is cleared in > > > > generic_shutdown_super() with &sb->s_umount held for write, > > > > I wonder what simpler serialization than srcu you could > > > > find/create in fsnotify. > > > > > > As far as I can tell there is no need for serialisation. > > > > > > The problem is that fsnotify_sb_error() can be called from the > > > context of ->put_super() call from generic_shutdown_super(). > > > > > > It's true that in the repro the thread calling fsnotify_sb_error() > > > in the worker thread running quota deferred work from put_super() > > > but I think there are sufficient barriers for this worker thread to > > > observer the cleared SB_ACTIVE flag. > > > > > > Anyway, according to syzbot, repro does not trigger the UAF > > > with my last fix. > > > > > > To be clear, any fsnotify_sb_error() that is a result of a user operation > > > would be holding an active reference to sb so cannot race with > > > fsnotify_sb_delete(), but I am not sure that same is true for ext4 > > > worker threads. > > > > > > Jan, > > > > > > You wrote that "In theory these two calls can even run in parallel > > > and fsnotify() can be holding fsnotify_sb_info pointer while > > > fsnotify_sb_delete() is freeing". > > > > > > Can you give an example of this case? > > > > Yeah, basically what Hilf writes: > > > > Task 1 Task 2 > > umount() some delayed work, transaction > > commit, whatever is still running > > before ext4_put_super() completes > > ... ext4_error() > > fsnotify_sb_error() > > fsnotify() > > fetches fsnotify_sb_info > > generic_shutdown_super() > > fsnotify_sb_delete() > > frees fsnotify_sb_info > > OK, so what do you say about Hillf's fix patch? > > Maybe it is ok to let go of the optimization in fsnotify(), considering > that we now have stronger optimizations in the inline hooks and > in __fsnotify_parent()? > > I think that Hillf's patch is missing setting s_fsnotify_info to NULL? > > @@ -101,8 +101,8 @@ void fsnotify_sb_delete(struct super_blo > wait_var_event(fsnotify_sb_watched_objects(sb), > !atomic_long_read(fsnotify_sb_watched_objects(sb))); > WARN_ON(fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_CONTENT)); > + WRITE_ONCE(sb->s_fsnotify_info, NULL); > + synchronize_srcu(&fsnotify_mark_srcu); > kfree(sbinfo); > } > I reworked Hillf's patch so it won't break the optimizations for the common case and added setting s_fsnotify_info to NULL (attached). Let's see what syzbot has to say: #syz test: https://github.com/amir73il/linux fsnotify-fixes Thanks, Amir.
From 4e42e6b66c24e42b4089b1212f0eccb01b7b7eda Mon Sep 17 00:00:00 2001 From: Amir Goldstein <amir73il@xxxxxxxxx> Date: Thu, 11 Apr 2024 18:59:08 +0300 Subject: [PATCH] fsnotify: fix UAF from FS_ERROR event on a shutting down filesystem Protect against use after free when filesystem calls fsnotify_sb_error() during fs shutdown. fsnotify_sb_error() may be called without an s_active reference, so use SRCU to synchronize access to fsnotify_sb_info() in this case. Reported-by: syzbot+5e3f9b2a67b45f16d4e6@xxxxxxxxxxxxxxxxxxxxxxxxx Suggested-by: Hillf Danton <hdanton@xxxxxxxx> Link: https://lore.kernel.org/linux-fsdevel/20240413014033.1722-1-hdanton@xxxxxxxx/ Fixes: 07a3b8d0bf72 ("fsnotify: lazy attach fsnotify_sb_info state to sb") Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> --- fs/notify/fsnotify.c | 14 ++++++++++++-- include/linux/fsnotify_backend.h | 3 +++ 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c index 2ae965ef37e8..ec9b535d333a 100644 --- a/fs/notify/fsnotify.c +++ b/fs/notify/fsnotify.c @@ -103,6 +103,8 @@ void fsnotify_sb_delete(struct super_block *sb) WARN_ON(fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_CONTENT)); WARN_ON(fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_PRE_CONTENT)); + WRITE_ONCE(sb->s_fsnotify_info, NULL); + synchronize_srcu(&fsnotify_mark_srcu); kfree(sbinfo); } @@ -506,6 +508,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir, struct dentry *moved; int inode2_type; int ret = 0; + bool sb_active_ref = !(mask & FS_EVENTS_POSS_ON_SHUTDOWN); __u32 test_mask, marks_mask; if (path) @@ -535,8 +538,10 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir, * However, if we do not walk the lists, we do not have to do * SRCU because we have no references to any objects and do not * need SRCU to keep them "alive". + * We only need SRCU to keep sbinfo "alive" for events possible + * during shutdown (e.g. FS_ERROR). */ - if ((!sbinfo || !sbinfo->sb_marks) && + if ((!sbinfo || (sb_active_ref && !sbinfo->sb_marks)) && (!mnt || !mnt->mnt_fsnotify_marks) && (!inode || !inode->i_fsnotify_marks) && (!inode2 || !inode2->i_fsnotify_marks)) @@ -562,7 +567,12 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir, return 0; iter_info.srcu_idx = srcu_read_lock(&fsnotify_mark_srcu); - + /* + * For events possible during shutdown (e.g. FS_ERROR) we may not hold + * an s_active reference on sb, so we need to re-fetch sbinfo with + * srcu_read_lock() held. + */ + sbinfo = fsnotify_sb_info(sb); if (sbinfo) { iter_info.marks[FSNOTIFY_ITER_TYPE_SB] = fsnotify_first_mark(&sbinfo->sb_marks); diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h index 7f1ab8264e41..f10987662d1f 100644 --- a/include/linux/fsnotify_backend.h +++ b/include/linux/fsnotify_backend.h @@ -97,6 +97,9 @@ */ #define FS_EVENTS_POSS_TO_PARENT (FS_EVENTS_POSS_ON_CHILD) +/* Events that could be generated while fs is shutting down */ +#define FS_EVENTS_POSS_ON_SHUTDOWN (FS_ERROR) + /* Events that can be reported to backends */ #define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \ FS_EVENTS_POSS_ON_CHILD | \ -- 2.34.1