On 2024/11/25 07:56, Matthew Wilcox wrote:
On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote:
__fput+0x5ba/0xa50 fs/file_table.c:458
task_work_run+0x24f/0x310 kernel/task_work.c:239
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
This is:
VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
i.e. we've called __folio_start_writeback() on a folio which is already
under writeback.
Higher up in the trace, we have the useful information:
page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5
memcg:ffff888140adc000
aops:btrfs_aops ino:105 dentry name(?):"file2"
flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710
raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000
page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio))
page_owner tracks the page as allocated
The interesting part of the page_owner stacktrace is:
filemap_alloc_folio_noprof+0xdf/0x500
__filemap_get_folio+0x446/0xbd0
prepare_one_folio+0xb6/0xa20
btrfs_buffered_write+0x6bd/0x1150
btrfs_direct_write+0x52d/0xa30
btrfs_do_write_iter+0x2a0/0x760
do_iter_readv_writev+0x600/0x880
vfs_writev+0x376/0xba0
(i.e. not very interesting)
Workqueue: btrfs-delalloc btrfs_work_helper
RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
Call Trace:
<TASK>
process_one_folio fs/btrfs/extent_io.c:187 [inline]
__process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
run_ordered_work fs/btrfs/async-thread.c:245 [inline]
btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
process_one_work kernel/workqueue.c:3229 [inline]
This looks like a race?
process_one_folio() calls btrfs_folio_clamp_set_writeback(), which calls
btrfs_subpage_set_writeback():
spin_lock_irqsave(&subpage->lock, flags);
bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
if (!folio_test_writeback(folio))
        folio_start_writeback(folio);
spin_unlock_irqrestore(&subpage->lock, flags);
so somebody else set writeback after we tested for writeback here.
The test VM is using X86_64, so we don't go into the subpage routine at
all, but call folio_start_writeback() directly.
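Roughly, the clamp helper only takes the bitmap path above when the fs is
subpage (sectorsize < page size); otherwise it calls folio_start_writeback()
directly. A simplified paraphrase from memory, not the exact code in
fs/btrfs/subpage.c:

void btrfs_folio_clamp_set_writeback(struct btrfs_fs_info *fs_info,
                                     struct folio *folio, u64 start, u32 len)
{
        if (!btrfs_is_subpage(fs_info, folio->mapping)) {
                /*
                 * 4K page with 4K sectorsize (the X86_64 test VM): no
                 * subpage bitmap, no "already set?" check under
                 * subpage->lock, we go straight into
                 * __folio_start_writeback() and its VM_BUG_ON_FOLIO().
                 */
                folio_start_writeback(folio);
                return;
        }
        /*
         * Subpage case: clamp the range to the folio, then take the
         * bitmap path shown above under subpage->lock.
         */
        btrfs_subpage_set_writeback(fs_info, folio, start, len);
}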
One thing that comes to mind is that _usually_ we take folio_lock()
first, then start writeback, then call folio_unlock() and btrfs isn't
doing that here (afaict). Maybe that's not the source of the bug?
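In code terms, the usual pattern is simply:

        folio_lock(folio);
        folio_start_writeback(folio);
        /* ... submit the I/O ... */
        folio_unlock(folio);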
We still hold the folio locked, do the submission, then unlock.
You can check extent_writepage(), where at the entrance we verify that
the folio is still locked.
Then inside extent_writepage_io() we do the submission, setting the
folio writeback inside submit_one_sector().
Eventually we unlock the folio at the end of extent_writepage(). That's
the uncompressed write path.
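To spell that ordering out, roughly (a heavily simplified sketch; the
argument lists are from memory and not exact):

static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl)
{
        int ret;

        /* entrance check: the folio must still be locked by the writer */
        ASSERT(folio_test_locked(folio));

        /*
         * extent_writepage_io() walks the dirty sectors; for each one
         * submit_one_sector() sets the writeback flag (this is where
         * folio_start_writeback() is eventually reached) and adds the
         * sector to the bio, all while the folio stays locked.
         */
        ret = extent_writepage_io(folio, bio_ctrl);

        /* only unlock at the very end, after submission */
        folio_unlock(folio);
        return ret;
}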
There is a lot of special handling for async submission (compression),
but it still holds the folio locked, does the compression and
submission, and then unlocks, just all in another thread (which is this
case).
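Something like this, with everything running in the btrfs-delalloc worker
(the function below is made up just to show the ordering; the call chain in
the comment is the one from the trace above):

static void async_extent_writeback_sketch(struct folio *folio)
{
        /*
         * The folio was locked by the original writer
         * (btrfs_buffered_write()) and the lock ownership was handed
         * over to this worker together with the async extent.
         */
        ASSERT(folio_test_locked(folio));

        /*
         * submit_compressed_extents()
         *   submit_one_async_extent()
         *     __process_folios_contig()
         *       process_one_folio()
         *         btrfs_folio_clamp_set_writeback()  <- where the splat fires
         */
        folio_start_writeback(folio);

        /* submit the compressed bio, then eventually unlock the folio */
        folio_unlock(folio);
}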
So it looks like something goes wrong when transferring ownership of
the page cache folios to the compression path, or there is an error
path that is not handled properly.
Unfortunately I'm not really able to reproduce the case using the
reproducer...
Thanks,
Qu
If it is, should we have a VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio)
in __folio_start_writeback()? Or is there somewhere that can't lock the
folio before starting writeback?
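The suggested check would sit next to the existing one at the top of
__folio_start_writeback() in mm/page-writeback.c, i.e. (only a sketch of
the suggestion, not a tested patch):

        VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
        /* suggested: catch callers starting writeback on an unlocked folio */
        VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);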