On 09/03, Chao Yu wrote: > On 2024/9/2 21:01, Julian Sun wrote: > > On Mon, 2024-09-02 at 16:13 +0800, Chao Yu wrote: > > > > On 2024/8/29 0:54, Julian Sun wrote: > > > > > > Hi, all. > > > > > > > > > > > > Recently syzbot reported a bug as following: > > > > > > > > > > > > kernel BUG at fs/f2fs/inode.c:896! > > > > > > CPU: 1 UID: 0 PID: 5217 Comm: syz-executor605 Not tainted > > > > > > 6.11.0-rc4-syzkaller-00033-g872cf28b8df9 #0 > > > > > > RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896 > > > > > > Call Trace: > > > > > > <TASK> > > > > > > evict+0x532/0x950 fs/inode.c:704 > > > > > > dispose_list fs/inode.c:747 [inline] > > > > > > evict_inodes+0x5f9/0x690 fs/inode.c:797 > > > > > > generic_shutdown_super+0x9d/0x2d0 fs/super.c:627 > > > > > > kill_block_super+0x44/0x90 fs/super.c:1696 > > > > > > kill_f2fs_super+0x344/0x690 fs/f2fs/super.c:4898 > > > > > > deactivate_locked_super+0xc4/0x130 fs/super.c:473 > > > > > > cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373 > > > > > > task_work_run+0x24f/0x310 kernel/task_work.c:228 > > > > > > ptrace_notify+0x2d2/0x380 kernel/signal.c:2402 > > > > > > ptrace_report_syscall include/linux/ptrace.h:415 [inline] > > > > > > ptrace_report_syscall_exit include/linux/ptrace.h:477 > > > > > > [inline] > > > > > > syscall_exit_work+0xc6/0x190 kernel/entry/common.c:173 > > > > > > syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 > > > > > > [inline] > > > > > > __syscall_exit_to_user_mode_work kernel/entry/common.c:205 > > > > > > [inline] > > > > > > syscall_exit_to_user_mode+0x279/0x370 > > > > > > kernel/entry/common.c:218 > > > > > > do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89 > > > > > > entry_SYSCALL_64_after_hwframe+0x77/0x7f > > > > > > > > > > > > The syzbot constructed the following scenario: concurrently > > > > > > creating directories and setting the file system to read-only. > > > > > > In this case, while f2fs was making dir, the filesystem > > > > > > switched to > > > > > > readonly, and when it tried to clear the dirty flag, it > > > > > > triggered > > Go back to the root cause, I have no idea why it can leave dirty inode > while mkdir races w/ readonly remount, due to the two operations should > be exclusive, IIUC. Wait, we can think of writable disk mounted as fs-readonly. In that case, IIRC, we allow to recover files/data by roll-forward and so on, which can make some dirty inodes. Can we check if there's any missing path which does not flush dirty inode? > > - mkdir > - do_mkdirat > - filename_create > - mnt_want_write > - mnt_get_write_access > - mount > - do_remount > - reconfigure_super > - sb_prepare_remount_readonly > - mnt_hold_writers > - vfs_mkdir > - f2fs_mkdir > > But when I try to reproduce this bug w/ reproducer provided by syzbot, > I have found a clue in the log: > > "skip recovering inline_dots inode..." > > So I doubt the root cause is __recover_dot_dentries() in f2fs_lookup() > generates dirty data/meta, in this path, we will not grab related lock > to exclude readonly remount. > > Let me try to verify below patch: > > https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/commit/?h=wip&id=69dc8fbbbb39f85f9f436ca562c98afbcc2a48d2 > > Thanks, > > > > > > > this > > > > > > code path: f2fs_mkdir()-> f2fs_sync_fs()- > > > > > > > f2fs_write_checkpoint() > > > > > > ->f2fs_readonly(). This resulted FI_DIRTY_INODE flag not being > > > > > > cleared, > > > > > > which eventually led to a bug being triggered during the > > > > > > FI_DIRTY_INODE > > > > > > check in f2fs_evict_inode(). > > > > > > > > > > > > In this case, we cannot do anything further, so if filesystem > > > > > > is > > > > > > readonly, > > > > > > do not trigger the BUG. Instead, clean up resources to the best > > > > > > of > > > > > > our > > > > > > ability to prevent triggering subsequent resource leak checks. > > > > > > > > > > > > If there is anything important I'm missing, please let me know, > > > > > > thanks. > > > > > > > > > > > > Reported-by: > > > > > > syzbot+ebea2790904673d7c618@xxxxxxxxxxxxxxxxxxxxxxxxx > > > > > > Closes: > > > > > > https://syzkaller.appspot.com/bug?extid=ebea2790904673d7c618 > > > > > > Fixes: ca7d802a7d8e ("f2fs: detect dirty inode in evict_inode") > > > > > > CC: stable@xxxxxxxxxxxxxxx > > > > > > Signed-off-by: Julian Sun <sunjunchao2870@xxxxxxxxx> > > > > > > --- > > > > > > fs/f2fs/inode.c | 3 ++- > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c > > > > > > index aef57172014f..ebf825dba0a5 100644 > > > > > > --- a/fs/f2fs/inode.c > > > > > > +++ b/fs/f2fs/inode.c > > > > > > @@ -892,7 +892,8 @@ void f2fs_evict_inode(struct inode *inode) > > > > > > atomic_read(&fi->i_compr_blocks)); > > > > > > if (likely(!f2fs_cp_error(sbi) && > > > > > > - !is_sbi_flag_set(sbi, > > > > > > SBI_CP_DISABLED))) > > > > > > + !is_sbi_flag_set(sbi, > > > > > > SBI_CP_DISABLED)) && > > > > > > + !f2fs_readonly(sbi->sb)) > > > > > > > > Is it fine to drop this dirty inode? Since once it remounts f2fs as > > > > rw one, > > > > previous updates on such inode may be lost? Or am I missing > > > > something? > > > > The purpose of calling this here is mainly to avoid triggering the > > f2fs_bug_on(sbi, 1); statement in the subsequent f2fs_put_super() due > > to a reference count check failure. > > I would say it's possible, but there doesn't seem to be much more we > > can do in this scenario: the inode is about to be freed, and the file > > system is read-only. Or do we need a mechanism to save the inode that > > is about to be freed and then write it back to disk at the appropriate > > time after the file system becomes rw again? But such a mechanism > > sounds somewhat complex and a little bit of weird... Do you have any > > suggestions? > > > > > > > > > > > > Thanks, > > > > > > > > > > f2fs_bug_on(sbi, is_inode_flag_set(inode, > > > > > > FI_DIRTY_INODE)); > > > > > > else > > > > > > f2fs_inode_synced(inode); > > > > > > > > > > Thanks,