On Mon, Oct 28, 2019 at 12:52:09PM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: 12d61c69 Add linux-next specific files for 20191024
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=15a0fa97600000
> kernel config: https://syzkaller.appspot.com/x/.config?x=afb75fd8c9fd5ed8
> dashboard link: https://syzkaller.appspot.com/bug?extid=efb9e48b9fbdc49bb34a
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13a63dc4e00000
>
> The bug was bisected to:
>
> commit 9c61acffe2b8833152041f7b6a02d1d0a17fd378
> Author: Song Liu <songliubraving@xxxxxx>
> Date: Wed Oct 23 00:24:28 2019 +0000
>
>     mm,thp: recheck each page before collapsing file THP
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13eb6ec0e00000
> final crash: https://syzkaller.appspot.com/x/report.txt?x=101b6ec0e00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=17eb6ec0e00000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+efb9e48b9fbdc49bb34a@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: 9c61acffe2b8 ("mm,thp: recheck each page before collapsing file THP")
>
> INFO: task khugepaged:1084 blocked for more than 143 seconds.
>       Not tainted 5.4.0-rc4-next-20191024 #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> khugepaged      D27568  1084      2 0x80004000
> Call Trace:
>  context_switch kernel/sched/core.c:3384 [inline]
>  __schedule+0x94a/0x1e70 kernel/sched/core.c:4069
>  schedule+0xd9/0x260 kernel/sched/core.c:4136
>  io_schedule+0x1c/0x70 kernel/sched/core.c:5780
>  wait_on_page_bit_common mm/filemap.c:1175 [inline]
>  __lock_page+0x422/0xab0 mm/filemap.c:1383
>  lock_page include/linux/pagemap.h:480 [inline]
>  mpage_prepare_extent_to_map+0xb3f/0xf90 fs/ext4/inode.c:2668
>  ext4_writepages+0xb6a/0x2e70 fs/ext4/inode.c:2866
>  ? 0xffffffff81000000
>  do_writepages+0xfa/0x2a0 mm/page-writeback.c:2344
>  __filemap_fdatawrite_range+0x2bc/0x3b0 mm/filemap.c:421
>  __filemap_fdatawrite mm/filemap.c:429 [inline]
>  filemap_flush+0x24/0x30 mm/filemap.c:456

This is a double-locking deadlock. The page lock is already held when
we call into filemap_flush() here, and writeback then does another
lock_page() on the same page (write_cache_pages() in the generic path,
mpage_prepare_extent_to_map() in the ext4 trace above).

To fix it, we have to either initiate flushing before acquiring the
page lock, or simply skip over dirty pages.

Maybe doing vfs_fsync_range() from the madvise(HUGEPAGE) call isn't a
bad idea after all? (I had discussed this with Song off-list before.)
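
To make the ordering problem concrete, here is a minimal user-space sketch in C. It is not kernel code: struct model_page, model_flush() and the collapse_*() helpers are hypothetical names made up for illustration, and a failed trylock stands in for the point where the real lock_page() would sleep forever. It only models the two options mentioned above (flush before locking, or skip dirty pages).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a page cache page. */
struct model_page {
	bool locked;	/* models the page lock */
	bool dirty;	/* models the dirty flag */
};

enum model_result { COLLAPSED, SKIPPED_DIRTY, WOULD_DEADLOCK };

/* Non-blocking lock; failure marks where lock_page() would block. */
static bool model_trylock_page(struct model_page *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

static void model_unlock_page(struct model_page *p)
{
	p->locked = false;
}

/* Models writeback: it must lock a dirty page before cleaning it. */
static bool model_flush(struct model_page *p)
{
	if (!p->dirty)
		return true;
	if (!model_trylock_page(p))
		return false;	/* the real lock_page() would sleep here */
	p->dirty = false;
	model_unlock_page(p);
	return true;
}

/* Ordering from the report: flush while the page lock is held. */
static enum model_result collapse_flush_under_lock(struct model_page *p)
{
	bool flushed;

	if (!model_trylock_page(p))
		return WOULD_DEADLOCK;
	flushed = model_flush(p);	/* re-locks a page we already hold */
	model_unlock_page(p);
	return flushed ? COLLAPSED : WOULD_DEADLOCK;
}

/* Option 1: initiate flushing before acquiring the page lock. */
static enum model_result collapse_flush_first(struct model_page *p)
{
	if (!model_flush(p))
		return WOULD_DEADLOCK;
	assert(!p->dirty);
	if (!model_trylock_page(p))
		return WOULD_DEADLOCK;
	/* ... the collapse work would happen here ... */
	model_unlock_page(p);
	return COLLAPSED;
}

/* Option 2: simply skip over dirty pages, never flushing under the lock. */
static enum model_result collapse_skip_dirty(struct model_page *p)
{
	enum model_result ret = COLLAPSED;

	if (!model_trylock_page(p))
		return WOULD_DEADLOCK;
	if (p->dirty)
		ret = SKIPPED_DIRTY;
	model_unlock_page(p);
	return ret;
}
```

In this model, calling collapse_flush_under_lock() on a dirty page reports WOULD_DEADLOCK, while the other two helpers finish with the page unlocked. Option 1 leaves the page clean; option 2 leaves it dirty for a later pass.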