On Mon, Nov 25, 2024 at 1:30 AM 'Qu Wenruo' via syzkaller-bugs <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote: > > > > 在 2024/11/25 07:56, Matthew Wilcox 写道: > > On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote: > >> > >> __fput+0x5ba/0xa50 fs/file_table.c:458 > >> task_work_run+0x24f/0x310 kernel/task_work.c:239 > >> resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] > >> exit_to_user_mode_loop kernel/entry/common.c:114 [inline] > >> exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > >> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > >> syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218 > >> do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89 > >> entry_SYSCALL_64_after_hwframe+0x77/0x7f > > > > This is: > > > > VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); > > > > ie we've called __folio_start_writeback() on a folio which is already > > under writeback. > > > > Higher up in the trace, we have the useful information: > > > > page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5 > > memcg:ffff888140adc000 > > aops:btrfs_aops ino:105 dentry name(?):"file2" > > flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff) > > raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710 > > raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000 > > page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio)) > > page_owner tracks the page as allocated > > > > The interesting part of the page_owner stacktrace is: > > > > filemap_alloc_folio_noprof+0xdf/0x500 > > __filemap_get_folio+0x446/0xbd0 > > prepare_one_folio+0xb6/0xa20 > > btrfs_buffered_write+0x6bd/0x1150 > > btrfs_direct_write+0x52d/0xa30 > > btrfs_do_write_iter+0x2a0/0x760 > > do_iter_readv_writev+0x600/0x880 > > vfs_writev+0x376/0xba0 > > > > (ie not very interesting) > > > >> Workqueue: btrfs-delalloc btrfs_work_helper > >> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119 > >> Call Trace: > >> <TASK> > >> process_one_folio fs/btrfs/extent_io.c:187 [inline] > >> __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216 > >> submit_one_async_extent fs/btrfs/inode.c:1229 [inline] > >> submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632 > >> run_ordered_work fs/btrfs/async-thread.c:245 [inline] > >> btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324 > >> process_one_work kernel/workqueue.c:3229 [inline] > > > > This looks like a race? > > > > process_one_folio() calls > > btrfs_folio_clamp_set_writeback calls > > btrfs_subpage_set_writeback: > > > > spin_lock_irqsave(&subpage->lock, flags); > > bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits) > > ; > > if (!folio_test_writeback(folio)) > > folio_start_writeback(folio); > > spin_unlock_irqrestore(&subpage->lock, flags); > > > > so somebody else set writeback after we tested for writeback here. > > The test VM is using X86_64, thus we won't go into the subpage routine, > but directly call folio_start_writeback(). > > > > > One thing that comes to mind is that _usually_ we take folio_lock() > > first, then start writeback, then call folio_unlock() and btrfs isn't > > doing that here (afaict). Maybe that's not the source of the bug? > > We still hold the folio locked, do submission then unlock. > > You can check extent_writepage(), where at the entrance we check if the > folio is still locked. > Then inside extent_writepage_io() we do the submission, setting the > folio writeback inside submit_one_sector(). > Eventually unlock the folio at the end of extent_writepage(), that's for > the uncompressed writes. > > There are a lot of special handling for async submission (compression), > but it still holds the folio locked, do compression and submission, and > unlock, just all in another thread (this case). > > So it looks like something is wrong when transferring the ownership of > the page cache folios to the compression path, or some not properly > handled error path. > > Unfortunately I'm not really able to reproduce the case using the > reproducer... I've just tried to reproduce locally using the downloadable assets and the kernel crashed ~ after 1 minute of running the attached C repro. [ 87.616440][ T9044] ------------[ cut here ]------------ [ 87.617126][ T9044] kernel BUG at mm/page-writeback.c:3119! [ 87.619308][ T9044] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI [ 87.620174][ T9044] CPU: 1 UID: 0 PID: 9044 Comm: kworker/u10:6 Not tainted 6.12.0-syzkaller-08446-g228a1157fb9f #0 Here are the instructions I followed: https://github.com/google/syzkaller/blob/master/docs/syzbot_assets.md#run-a-c-reproducer -- Aleksandr > > Thanks, > Qu > > > > > > > If it is, should we have a VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio) > > in __folio_start_writeback()? Or is there somewhere that can't lock the > > folio before starting writeback? > > >