On 2024/11/25 07:56, Matthew Wilcox wrote:
On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote:
__fput+0x5ba/0xa50 fs/file_table.c:458
task_work_run+0x24f/0x310 kernel/task_work.c:239
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
This is:
VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
i.e. we've called __folio_start_writeback() on a folio which is already
under writeback.
Higher up in the trace, we have the useful information:
page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5
memcg:ffff888140adc000
aops:btrfs_aops ino:105 dentry name(?):"file2"
flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710
raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000
page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio))
page_owner tracks the page as allocated
The interesting part of the page_owner stacktrace is:
filemap_alloc_folio_noprof+0xdf/0x500
__filemap_get_folio+0x446/0xbd0
prepare_one_folio+0xb6/0xa20
btrfs_buffered_write+0x6bd/0x1150
btrfs_direct_write+0x52d/0xa30
btrfs_do_write_iter+0x2a0/0x760
do_iter_readv_writev+0x600/0x880
vfs_writev+0x376/0xba0
(i.e. not very interesting)
Workqueue: btrfs-delalloc btrfs_work_helper
RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
Call Trace:
<TASK>
process_one_folio fs/btrfs/extent_io.c:187 [inline]
__process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
run_ordered_work fs/btrfs/async-thread.c:245 [inline]
btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
process_one_work kernel/workqueue.c:3229 [inline]
This looks like a race?
process_one_folio() calls btrfs_folio_clamp_set_writeback(), which calls
btrfs_subpage_set_writeback():
spin_lock_irqsave(&subpage->lock, flags);
bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
if (!folio_test_writeback(folio))
        folio_start_writeback(folio);
spin_unlock_irqrestore(&subpage->lock, flags);
so somebody else set writeback after we tested for writeback here.
The test VM is using X86_64, so we don't go into the subpage routine at
all, but call folio_start_writeback() directly.
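Roughly, the clamp helper only takes the bitmap path above when the fs is
subpage (sectorsize < page size); otherwise it calls folio_start_writeback()
directly. A simplified paraphrase from memory, not the exact code in
fs/btrfs/subpage.c:

void btrfs_folio_clamp_set_writeback(struct btrfs_fs_info *fs_info,
                                     struct folio *folio, u64 start, u32 len)
{
        if (!btrfs_is_subpage(fs_info, folio->mapping)) {
                /*
                 * 4K page with 4K sectorsize (the X86_64 test VM): no
                 * subpage bitmap, no "already set?" check under
                 * subpage->lock, we go straight into
                 * __folio_start_writeback() and its VM_BUG_ON_FOLIO().
                 */
                folio_start_writeback(folio);
                return;
        }
        /*
         * Subpage case: clamp the range to the folio, then take the
         * bitmap path shown above under subpage->lock.
         */
        btrfs_subpage_set_writeback(fs_info, folio, start, len);
}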
One thing that comes to mind is that _usually_ we take folio_lock()
first, then start writeback, then call folio_unlock() and btrfs isn't
doing that here (afaict). Maybe that's not the source of the bug?
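In code terms, the usual pattern is simply:

        folio_lock(folio);
        folio_start_writeback(folio);
        /* ... submit the I/O ... */
        folio_unlock(folio);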
We still hold the folio locked, do the submission, then unlock.
You can check extent_writepage(), where at the entrance we verify that
the folio is still locked.
Then inside extent_writepage_io() we do the submission, setting the
folio writeback inside submit_one_sector().
Eventually we unlock the folio at the end of extent_writepage(). That's
the uncompressed write path.
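To spell that ordering out, roughly (a heavily simplified sketch; the
argument lists are from memory and not exact):

static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl)
{
        int ret;

        /* entrance check: the folio must still be locked by the writer */
        ASSERT(folio_test_locked(folio));

        /*
         * extent_writepage_io() walks the dirty sectors; for each one
         * submit_one_sector() sets the writeback flag (this is where
         * folio_start_writeback() is eventually reached) and adds the
         * sector to the bio, all while the folio stays locked.
         */
        ret = extent_writepage_io(folio, bio_ctrl);

        /* only unlock at the very end, after submission */
        folio_unlock(folio);
        return ret;
}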
There is a lot of special handling for async submission (compression),
but it still holds the folio locked, does the compression and
submission, and then unlocks, just all in another thread (which is this
case).
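Something like this, with everything running in the btrfs-delalloc worker
(the function below is made up just to show the ordering; the call chain in
the comment is the one from the trace above):

static void async_extent_writeback_sketch(struct folio *folio)
{
        /*
         * The folio was locked by the original writer
         * (btrfs_buffered_write()) and the lock ownership was handed
         * over to this worker together with the async extent.
         */
        ASSERT(folio_test_locked(folio));

        /*
         * submit_compressed_extents()
         *   submit_one_async_extent()
         *     __process_folios_contig()
         *       process_one_folio()
         *         btrfs_folio_clamp_set_writeback()  <- where the splat fires
         */
        folio_start_writeback(folio);

        /* submit the compressed bio, then eventually unlock the folio */
        folio_unlock(folio);
}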
So it looks like something goes wrong when transferring ownership of
the page cache folios to the compression path, or there is an error
path that is not handled properly.
Unfortunately I'm not really able to reproduce the case using the
reproducer...
Thanks,
Qu
If it is, should we have a VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio)
in __folio_start_writeback()? Or is there somewhere that can't lock the
folio before starting writeback?
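The suggested check would sit next to the existing one at the top of
__folio_start_writeback() in mm/page-writeback.c, i.e. (only a sketch of
the suggestion, not a tested patch):

        VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
        /* suggested: catch callers starting writeback on an unlocked folio */
        VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);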