On Mon, Apr 15, 2024 at 5:03 PM Jan Kara <jack@xxxxxxx> wrote: > > On Sat 13-04-24 12:32:32, Amir Goldstein wrote: > > On Sat, Apr 13, 2024 at 11:45 AM Hillf Danton <hdanton@xxxxxxxx> wrote: > > > On Fri, 12 Apr 2024 23:42:19 -0700 Amir Goldstein > > > > On Sat, Apr 13, 2024 at 4:41=E2=80=AFAM Hillf Danton <hdanton@xxxxxxxx> wrote: > > > > > On Thu, 11 Apr 2024 01:11:20 -0700 > > > > > > syzbot found the following issue on: > > > > > > > > > > > > HEAD commit: 6ebf211bb11d Add linux-next specific files for 20240410 > > > > > > git tree: linux-next > > > > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=3D1621af9d180000 > > > > > > > > > > #syz test https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git 6ebf211bb11d > > > > > > > > > > --- x/fs/notify/fsnotify.c > > > > > +++ y/fs/notify/fsnotify.c > > > > > @@ -101,8 +101,8 @@ void fsnotify_sb_delete(struct super_blo > > > > > wait_var_event(fsnotify_sb_watched_objects(sb), > > > > > !atomic_long_read(fsnotify_sb_watched_objects(sb))); > > > > > WARN_ON(fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_CONTENT)); > > > > > - WARN_ON(fsnotify_sb_has_priority_watchers(sb, > > > > > - FSNOTIFY_PRIO_PRE_CONTENT)); > > > > > + WARN_ON(fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_PRE_CONTENT)); > > > > > + synchronize_srcu(&fsnotify_mark_srcu); > > > > > kfree(sbinfo); > > > > > } > > > > > > > > > > @@ -499,7 +499,7 @@ int fsnotify(__u32 mask, const void *dat > > > > > { > > > > > const struct path *path =3D fsnotify_data_path(data, data_type); > > > > > struct super_block *sb =3D fsnotify_data_sb(data, data_type); > > > > > - struct fsnotify_sb_info *sbinfo =3D fsnotify_sb_info(sb); > > > > > + struct fsnotify_sb_info *sbinfo; > > > > > struct fsnotify_iter_info iter_info = {}; > > > > > struct mount *mnt =3D NULL; > > > > > struct inode *inode2 =3D NULL; > > > > > @@ -529,6 +529,8 @@ int fsnotify(__u32 mask, const void *dat > > > > > inode2_type =3D FSNOTIFY_ITER_TYPE_PARENT; > > > > > } > > > > > > > > > > + iter_info.srcu_idx =3D srcu_read_lock(&fsnotify_mark_srcu); > > > > > + sbinfo =3D fsnotify_sb_info(sb); > > > > > /* > > > > > * Optimization: srcu_read_lock() has a memory barrier which can > > > > > * be expensive. It protects walking the *_fsnotify_marks lists. > > > > > > > > > > > > See comment above. This kills the optimization. > > > > It is not worth letting all the fsnotify hooks suffer the consequence > > > > for the edge case of calling fsnotify hook during fs shutdown. > > > > > > Say nothing before reading your fix. > > > > > > > > Also, fsnotify_sb_info(sb) in fsnotify_sb_has_priority_watchers() > > > > is also not protected and using srcu_read_lock() there completely > > > > nullifies the purpose of fsnotify_sb_info. > > > > > > > > Here is a simplified fix for fsnotify_sb_error() rebased on the > > > > pending mm fixes for this syzbot boot failure: > > > > > > > > #syz test: https://github.com/amir73il/linux fsnotify-fixes > > > > > > Feel free to post your patch at lore because not everyone has > > > access to sites like github. > > > > > > > > Jan, > > > > > > > > I think that all the functions called from fs shutdown context > > > > should observe that SB_ACTIVE is cleared but wasn't sure? > > > > > > If you composed fix based on SB_ACTIVE that is cleared in > > > generic_shutdown_super() with &sb->s_umount held for write, > > > I wonder what simpler serialization than srcu you could > > > find/create in fsnotify. > > > > As far as I can tell there is no need for serialisation. > > > > The problem is that fsnotify_sb_error() can be called from the > > context of ->put_super() call from generic_shutdown_super(). > > > > It's true that in the repro the thread calling fsnotify_sb_error() > > in the worker thread running quota deferred work from put_super() > > but I think there are sufficient barriers for this worker thread to > > observer the cleared SB_ACTIVE flag. > > > > Anyway, according to syzbot, repro does not trigger the UAF > > with my last fix. > > > > To be clear, any fsnotify_sb_error() that is a result of a user operation > > would be holding an active reference to sb so cannot race with > > fsnotify_sb_delete(), but I am not sure that same is true for ext4 > > worker threads. > > > > Jan, > > > > You wrote that "In theory these two calls can even run in parallel > > and fsnotify() can be holding fsnotify_sb_info pointer while > > fsnotify_sb_delete() is freeing". > > > > Can you give an example of this case? > > Yeah, basically what Hilf writes: > > Task 1 Task 2 > umount() some delayed work, transaction > commit, whatever is still running > before ext4_put_super() completes > ... ext4_error() > fsnotify_sb_error() > fsnotify() > fetches fsnotify_sb_info > generic_shutdown_super() > fsnotify_sb_delete() > frees fsnotify_sb_info OK, so what do you say about Hillf's fix patch? Maybe it is ok to let go of the optimization in fsnotify(), considering that we now have stronger optimizations in the inline hooks and in __fsnotify_parent()? I think that Hillf's patch is missing setting s_fsnotify_info to NULL? @@ -101,8 +101,8 @@ void fsnotify_sb_delete(struct super_blo wait_var_event(fsnotify_sb_watched_objects(sb), !atomic_long_read(fsnotify_sb_watched_objects(sb))); WARN_ON(fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_CONTENT)); + WRITE_ONCE(sb->s_fsnotify_info, NULL); + synchronize_srcu(&fsnotify_mark_srcu); kfree(sbinfo); } Thanks, Amir.