On Tue, Aug 09, 2022 at 10:53:21PM -0700, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: 200e340f2196 Merge tag 'pull-work.dcache' of git://git.ker.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=13d08412080000 > kernel config: https://syzkaller.appspot.com/x/.config?x=a3f4d6985d3164cd > dashboard link: https://syzkaller.appspot.com/bug?extid=ed920a72fd23eb735158 > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15dd033e080000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16dbfa46080000 > > Bisection is inconclusive: the issue happens on the oldest tested release. tl;dr: Well known problem. Don't do O_DSYNC direct IO writes on vfat. Basically, vfat uses __generic_file_sync() which takes the inode_lock(). It's not valid to take the inode_lock() in DIO completion callbacks as we do for O_DSYNC/O_SYNC writes because setattr needs to do: inode_lock() inode_dio_wait() <waits for inode->i_dio_count to go to zero> to wait for all pending direct IO to drain before it can proceed. Hence: <dio holds inode->i_dio_count reference> dio_complete generic_write_sync vfs_fsync_range fat_file_fsync __generic_file_fsync inode_lock <blocks> O_DSYNC DIO completion will attempt to lock the inode with an elevated inode->i_dio_count (as is always the case when dio_complete() is called) and hence we have a trivial ABBA deadlock vector via truncate, hole punching, etc. > INFO: task kworker/0:1:14 blocked for more than 143 seconds. > Not tainted 5.19.0-syzkaller-02972-g200e340f2196 #0 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > task:kworker/0:1 state:D stack:26544 pid: 14 ppid: 2 flags:0x00004000 > Workqueue: dio/loop5 dio_aio_complete_work > Call Trace: > <TASK> > context_switch kernel/sched/core.c:5178 [inline] > __schedule+0xa00/0x4c10 kernel/sched/core.c:6490 > schedule+0xda/0x1b0 kernel/sched/core.c:6566 > rwsem_down_write_slowpath+0x697/0x11e0 kernel/locking/rwsem.c:1182 > __down_write_common kernel/locking/rwsem.c:1297 [inline] > __down_write_common kernel/locking/rwsem.c:1294 [inline] > __down_write kernel/locking/rwsem.c:1306 [inline] > down_write+0x135/0x150 kernel/locking/rwsem.c:1553 > inode_lock include/linux/fs.h:760 [inline] > __generic_file_fsync+0xb0/0x1f0 fs/libfs.c:1119 > fat_file_fsync+0x73/0x200 fs/fat/file.c:191 > vfs_fsync_range+0x13a/0x220 fs/sync.c:188 > generic_write_sync include/linux/fs.h:2861 [inline] > dio_complete+0x6dd/0x950 fs/direct-io.c:310 > process_one_work+0x996/0x1610 kernel/workqueue.c:2289 > worker_thread+0x665/0x1080 kernel/workqueue.c:2436 > kthread+0x2e9/0x3a0 kernel/kthread.c:376 > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 > </TASK> There's dio completion. > INFO: task syz-executor775:3664 blocked for more than 144 seconds. > Not tainted 5.19.0-syzkaller-02972-g200e340f2196 #0 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > task:syz-executor775 state:D stack:26128 pid: 3664 ppid: 3656 flags:0x00004004 > Call Trace: > <TASK> > context_switch kernel/sched/core.c:5178 [inline] > __schedule+0xa00/0x4c10 kernel/sched/core.c:6490 > schedule+0xda/0x1b0 kernel/sched/core.c:6566 > __inode_dio_wait fs/inode.c:2381 [inline] > inode_dio_wait+0x22a/0x270 fs/inode.c:2399 > fat_setattr+0x3de/0x13c0 fs/fat/file.c:509 > notify_change+0xcd0/0x1440 fs/attr.c:418 > do_truncate+0x13c/0x200 fs/open.c:65 > do_sys_ftruncate+0x536/0x730 fs/open.c:193 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd There's truncate waiting on dio completion holding the inode lock. So, as expected, any filesystem that supports DIO and calls into __generic_file_fsync() for fsync functionality can easily deadlock truncate against O_DSYNC DIO writes... -Dave. -- Dave Chinner david@xxxxxxxxxxxxx