Re: Btrfs crash on generic/437 on x86_64

Qu Wenruo <quwenruo.btrfs@xxxxxxx> · Mon, 10 Feb 2025 14:13:53 +1030

On 2025/2/10 14:01, Matthew Wilcox wrote:
On Mon, Feb 10, 2025 at 01:40:16PM +1030, Qu Wenruo wrote:
But this one is a little weird, we got a folio which is still mapped
during filemap_unaccount_folio().

I can reproduce it with the default mount options using generic/437; so
far 32 runs are enough to trigger it reliably.

I have not yet been able to reproduce it on aarch64 (tried both 64K and
4K page sizes so far).

I'm already trying to bisect the bug; so far it's still reproducible
at 6.14-rc1.

Any advice/clue would be appreciated.

Dmesg:

[   58.305921] BTRFS info (device dm-0): using free-space-tree
[   58.319296] run fstests generic/437 at 2025-02-10 13:24:19
[   59.283069] BUG: Bad rss-counter state mm:0000000048578720 type:MM_FILEPAGES val:1

This is the original problem; everything else is a consequence.  We're
calling check_mm() in __mmdrop() -- i.e. we're dropping the last
reference to an mm, and its counters show one page is still mapped.
And it's a file page.  (See below for the consequence.)
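
To make the counter check concrete, here is a minimal toy model in
Python. It is not kernel code: ToyMM and its methods are hypothetical
names, and the only thing it illustrates is the idea that a counter left
non-zero when the mm is torn down gets reported.

```python
from collections import Counter


class ToyMM:
    """Toy stand-in for an mm's RSS counters (hypothetical, not kernel code)."""

    def __init__(self):
        self.rss = Counter()

    def map_page(self, kind):
        # Mapping a page into this mm bumps the matching counter.
        self.rss[kind] += 1

    def unmap_page(self, kind):
        # Unmapping is expected to undo exactly one earlier map_page().
        self.rss[kind] -= 1

    def check(self):
        # Idea behind check_mm(): every counter should be zero at teardown.
        return {kind: val for kind, val in self.rss.items() if val != 0}


mm = ToyMM()
mm.map_page("MM_FILEPAGES")
mm.map_page("MM_ANONPAGES")
mm.unmap_page("MM_ANONPAGES")
# The file page is never unmapped, which is the state the log reports.
leftover = mm.check()
print(leftover)  # {'MM_FILEPAGES': 1}
```

The leftover entry corresponds to the "type:MM_FILEPAGES val:1" part of
the message above.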

[   59.296485] page: refcount:3 mapcount:1 mapping:00000000828f872f index:0x0 pfn:0x13ab4f

This folio still has a mapcount of 1.
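
The decoded fields can be cross-checked against the raw words printed
later in the same dump. The sketch below rests on two assumptions that
are not stated in the log: the flag bit positions are inferred from the
names the dump itself printed, and the third word of the second "raw:"
line is taken to hold the stored mapcount (mapcount minus one) in its low
32 bits and the refcount in its high 32 bits, as on x86_64.

```python
# Values copied from the dump in this message.
flags_word = 0x02FFFF800000002D
count_word = 0x0000000300000000  # third word of the second "raw:" line

# ASSUMPTION: bit positions inferred from the names the dump printed;
# they are kernel-version specific.
FLAG_BITS = {0: "locked", 2: "referenced", 3: "uptodate", 5: "lru"}


def decode_flags(word):
    """Return the names of the known low flag bits set in word."""
    return [name for bit, name in sorted(FLAG_BITS.items()) if word & (1 << bit)]


def decode_counts(word):
    """Split one 64-bit word into (refcount, mapcount).

    ASSUMPTION: low half is a signed 32-bit value storing mapcount - 1,
    high half is the refcount.
    """
    stored = word & 0xFFFFFFFF
    if stored >= 1 << 31:
        stored -= 1 << 32  # reinterpret as signed, so -1 means "unmapped"
    return word >> 32, stored + 1


print(decode_flags(flags_word))   # ['locked', 'referenced', 'uptodate', 'lru']
print(decode_counts(count_word))  # (3, 1)
```

Both results agree with the "refcount:3 mapcount:1" line and with the
flag names in the dump.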

[   59.297223] memcg:ffff888105a32000
[   59.297533] aops:btrfs_aops [btrfs] ino:1031b
[   59.298188] flags: 0x2ffff800000002d(locked|referenced|uptodate|lru|node=0|zone=2|lastcpupid=0x1ffff)
[   59.298955] raw: 02ffff800000002d ffffea0004184948 ffffea0004c40c88 ffff888107c7a2b8
[   59.299607] raw: 0000000000000000 0000000000000000 0000000300000000 ffff888105a32000
[   59.300261] page dumped because: VM_BUG_ON_FOLIO(folio_mapped(folio))
[   59.300846] ------------[ cut here ]------------
[   59.301256] kernel BUG at mm/filemap.c:154!
[   59.301635] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[   59.302144] CPU: 4 UID: 0 PID: 17354 Comm: umount Tainted: G
  OE      6.14.0-rc1-custom+ #211
[   59.302953] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[   59.303447] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
[   59.304291] RIP: 0010:filemap_unaccount_folio+0x153/0x1f0
[   59.305224] Code: b0 f0 00 00 00 e9 5d f6 00 00 48 c7 c6 80 1b 43 82 48 89 df e8 ae 89 04 00 0f 0b 48 c7 c6 10 d8 44 82 48 89 df e8 9d 89 04 00 <0f> 0b 48 8b 06 a8 40 74 4c 8b 43 50 e9 ce fe ff ff 48 c7 c6 80 1b
[   59.308807] RSP: 0018:ffffc90005387a18 EFLAGS: 00010046
[   59.309382] RAX: 0000000000000039 RBX: ffffea0004ead3c0 RCX: 0000000000000027
[   59.310313] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888277c21880
[   59.311856] RBP: ffff888107c7a2b8 R08: ffffffff82cad0a8 R09: 00000000fffff000
[   59.312879] R10: ffffffff82c55100 R11: 6d75642065676170 R12: 0000000000000001
[   59.313607] R13: ffffffffffffffff R14: ffffc90005387ad8 R15: ffff888107c7a2c0
[   59.314347] FS:  00007ff0455f2b80(0000) GS:ffff888277c00000(0000) knlGS:0000000000000000
[   59.315159] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.315744] CR2: 000055e761f94f58 CR3: 0000000166a44000 CR4: 00000000000006f0
[   59.316476] Call Trace:
[   59.316749]  <TASK>
[   59.321408]  delete_from_page_cache_batch+0x95/0x3c0
[   59.321912]  truncate_inode_pages_range+0x142/0x570
[   59.322413]  btrfs_evict_inode+0x8b/0x390 [btrfs]

So we're evicting an inode, and we ask truncate_inode_pages_range()
to get rid of all the folios in the inode's mapping.  It walks the
rmap to find every place they are mapped ... and doesn't find the
mapping of the folio above because the process that mapped it has
already exited.
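
A toy model of that failure mode, in Python. Everything here is
hypothetical (ToyFolio, ToyMapping, the "leak" switch); it only shows
why an unmap that walks the remaining VMAs cannot fix a mapcount left
behind by a process that is already gone. The real cause of the missed
unmap is still unknown at this point in the thread.

```python
class ToyFolio:
    def __init__(self):
        self.mapcount = 0


class ToyMapping:
    def __init__(self):
        self.vmas = []  # stand-in for the rmap: VMAs that may map our folios


def mmap_folio(mapping, folio):
    vma = {"ptes": [folio]}
    mapping.vmas.append(vma)
    folio.mapcount += 1
    return vma


def exit_process(mapping, vma, leak):
    # A correct exit drops every PTE before the VMA goes away.
    # leak=True models the suspected bug: the VMA vanishes, the count stays.
    if not leak:
        for folio in vma["ptes"]:
            folio.mapcount -= 1
    mapping.vmas.remove(vma)


def truncate(mapping, folio):
    # Unmap through whatever VMAs are still reachable from the mapping.
    for vma in mapping.vmas:
        if folio in vma["ptes"]:
            vma["ptes"].remove(folio)
            folio.mapcount -= 1
    # False corresponds to VM_BUG_ON_FOLIO(folio_mapped(folio)) firing.
    return folio.mapcount == 0


def scenario(leak):
    mapping, folio = ToyMapping(), ToyFolio()
    vma = mmap_folio(mapping, folio)
    exit_process(mapping, vma, leak)
    return truncate(mapping, folio)


print(scenario(leak=False))  # True
print(scenario(leak=True))   # False
```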

We need to figure out how we came to not unmap the page from the page
tables originally.  Looking through the merge log of the mm tree, my
suspicion falls on the following patchsets:

        - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
          addresses an issue where "huge" amounts of pte pagetables are
          accumulated:

        - "mm/vma: make more mmap logic userland testable" from Lorenzo
          Stoakes continues the work of moving vma-related code into the
          (relatively) new mm/vma.c

but of course it could be almost anything.

Bisecting now. Thankfully v6.13 seems good, so the regression is within
this merge window.
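
Since the failure is intermittent, each bisect step has to repeat the
test before calling a commit good. A minimal sketch of such a helper for
"git bisect run" follows. The names classify and run_fstest are
hypothetical, and it assumes a failure shows up as a non-zero exit
status from ./check in an fstests checkout; a kernel BUG may instead
need a dmesg check or a host-side driver that reboots the VM.

```python
import subprocess

MAX_RUNS = 32  # from the report: 32 runs have been enough to trigger it


def classify(run_once, max_runs=MAX_RUNS):
    """Return an exit code for `git bisect run`: 1 = bad, 0 = good."""
    for _ in range(max_runs):
        if not run_once():
            return 1  # first failure marks this commit as bad
    return 0  # never failed within max_runs, treat the commit as good


def run_fstest():
    # ASSUMPTION: current directory is an fstests checkout that is
    # already configured for the kernel under test.
    return subprocess.run(["./check", "generic/437"]).returncode == 0


# Intended use as the script body:
#   import sys; sys.exit(classify(run_fstest))
```

A commit is only reported as good after all 32 runs pass, so a false
"good" is still possible if the bug is rarer than it looks.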

I will report back with the bisect result and log.

Thanks,
Qu