On Sat, Jun 11, 2022 at 09:33:39PM +0100, Matthew Wilcox wrote: > On Sat, Jun 11, 2022 at 01:07:49PM -0700, Andrew Morton wrote: > > > > Hi. Do you recall if this is a new one? > > New to me. Does this happen reliably enough to do a bisect? > > Hmm. > > folio_lock(folio); > VM_BUG_ON_FOLIO(!folio_contains(folio, index), folio); > if (folio->mapping != mapping) { > folio_unlock(folio); > continue; > } > > so we found a truncated folio (below dump shows folio->mapping == NULL). > We should be able to solve this by simply moving the VM_BUG_ON_FOLIO > down four lines. > > I'm a little confused that this can happen; the page cache is littered > with comments saying: > > /* Leave page->index set: truncation lookup relies upon it */ > > so the VM_BUG_ON_FOLIO shouldn't need to be moved and hints at a problem > that I don't understand. I also don't understand that PG_head is set, > and yet dump_page() did not print: > pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n", > (is it possible it was inadvertently omitted from the bug report?) Hi Matthew, I just checked the full console log, didn't find above warning, it just printed: [ 8590.600151] page:0000000027772b07 refcount:2 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x2a7a00 As the current code logic(mm/debug.c::): pr_warn("page:%p refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n", page, page_ref_count(head), mapcount, mapping, page_to_pgoff(page), page_to_pfn(page)); if (compound) { pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n", head, compound_order(head), folio_entire_mapcount(folio), head_compound_pincount(head)); } The 1st warn as above, but didn't get the 2nd warn. Might due to the compound==false? This bug isn't easy to reproduce, I just hit it once on x86_64, I'm still trying to reproduce it. JFYI, another [mm] bug [1] is 100% reproducible for me. Thanks, Zorro [1] kernel BUG at mm/usercopy.c:101! https://bugzilla.kernel.org/show_bug.cgi?id=216073 > > Also a head page should not be able to have an odd index. So there's a > lot here that doesn't make sense to me right now. > > > Begin forwarded message: > > > > Date: Sat, 11 Jun 2022 10:50:53 +0000 > > From: bugzilla-daemon@xxxxxxxxxx > > To: akpm@xxxxxxxxxxxxxxxxxxxx > > Subject: [Bug 216114] New: page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio, index)) and kernel BUG at mm/truncate.c:669! > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=216114 > > > > Bug ID: 216114 > > Summary: page dumped because: > > VM_BUG_ON_FOLIO(!folio_contains(folio, index)) and > > kernel BUG at mm/truncate.c:669! > > Product: Memory Management > > Version: 2.5 > > Kernel Version: 5.19-rc1 > > Hardware: All > > OS: Linux > > Tree: Mainline > > Status: NEW > > Severity: normal > > Priority: P1 > > Component: Other > > Assignee: akpm@xxxxxxxxxxxxxxxxxxxx > > Reporter: zlang@xxxxxxxxxx > > Regression: No > > > > xfstests on x86_64 with 64k directory size (mkfs.xfs -n size=65536) XFS[1] hit > > panic[2]. The kernel HEAD which I used is > > > > commit 874c8ca1e60b2c564a48f7e7acc40d328d5c8733 > > Author: David Howells <dhowells@xxxxxxxxxx> > > Date: Thu Jun 9 21:46:04 2022 +0100 > > > > netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context > > > > > > [1] > > meta-data=/dev/sda4 isize=512 agcount=16, agsize=245696 blks > > = sectsz=512 attr=2, projid32bit=1 > > = crc=1 finobt=1, sparse=1, rmapbt=0 > > = reflink=1 bigtime=1 inobtcount=1 > > data = bsize=4096 blocks=3931136, imaxpct=25 > > = sunit=64 swidth=192 blks > > naming =version 2 bsize=65536 ascii-ci=0, ftype=1 > > log =internal log bsize=4096 blocks=16384, version=2 > > = sectsz=512 sunit=64 blks, lazy-count=1 > > realtime =none extsz=4096 blocks=0, rtextents=0 > > > > [2] > > # ./scripts/decode_stacktrace.sh vmlinux < crash.log > > [ 8525.364621] run fstests generic/132 at 2022-06-10 17:58:32 > > [ 8529.173644] XFS (sda4): Mounting V5 Filesystem > > [ 8529.338529] XFS (sda4): Ending clean mount > > [ 8531.015050] restraintd[1356]: *** Current Time: Fri Jun 10 17:58:44 2022 > > Localwatchdog at: Sun Jun 12 15:40:44 2022 > > [ 8560.723674] XFS (sda5): Unmounting Filesystem > > [ 8560.984233] XFS (sda4): EXPERIMENTAL online scrub feature in use. Use at > > your own risk! > > [ 8561.787448] XFS (sda4): Unmounting Filesystem > > [ 8562.925361] XFS (sda4): Mounting V5 Filesystem > > [ 8563.101997] XFS (sda4): Ending clean mount > > [ 8563.163581] XFS (sda4): Unmounting Filesystem > > [ 8563.890637] XFS (sda5): Mounting V5 Filesystem > > [ 8564.087515] XFS (sda5): Ending clean mount > > [ 8567.049332] XFS (sda4): Mounting V5 Filesystem > > [ 8567.176278] XFS (sda4): Ending clean mount > > [ 8567.213659] XFS (sda4): Unmounting Filesystem > > [ 8567.476953] XFS (sda5): EXPERIMENTAL online scrub feature in use. Use at > > your own risk! > > [ 8573.888318] XFS (sda5): Unmounting Filesystem > > [ 8575.293213] XFS (sda5): Mounting V5 Filesystem > > [ 8575.545576] XFS (sda5): Ending clean mount > > [ 8575.883979] run fstests generic/133 at 2022-06-10 17:59:23 > > [ 8590.600151] page:0000000027772b07 refcount:2 mapcount:0 > > mapping:0000000000000000 index:0x1 pfn:0x2a7a00 > > [ 8590.601327] flags: > > 0x57ffffc0050000(head|reclaim|node=1|zone=2|lastcpupid=0x1fffff) > > [ 8590.601341] raw: 0057ffffc0050000 0000000000000000 dead000000000122 > > 0000000000000000 > > [ 8590.601345] raw: 0000000000007300 0000000000000000 00000001ffffffff > > 0000000000000000 > > [ 8590.601348] page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio, > > index)) > > [ 8590.601416] ------------[ cut here ]------------ > > [ 8590.601417] kernel BUG at mm/truncate.c:669! > > [ 8590.601431] invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI > > [ 8590.606841] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 02/06/2014 > > [ 8590.607178] RIP: 0010:invalidate_inode_pages2_range (mm/truncate.c:669 > > (discriminator 1)) > > [ 8590.607924] Code: c0 03 38 d0 7c 08 84 d2 0f 85 aa 06 00 00 41 8b 47 5c 49 > > 39 c6 0f 82 80 fe ff ff 48 c7 c6 a0 3b 55 99 4c 89 ff e8 7e 9f 07 00 <0f> 0b e8 > > 37 ec fd ff 4c 89 ff e8 9f c3 03 > > 00 84 c0 0f 85 2d 02 00 > > All code > > ======== > > 0: c0 03 38 rolb $0x38,(%rbx) > > 3: d0 7c 08 84 sarb -0x7c(%rax,%rcx,1) > > 7: d2 0f rorb %cl,(%rdi) > > 9: 85 aa 06 00 00 41 test %ebp,0x41000006(%rdx) > > f: 8b 47 5c mov 0x5c(%rdi),%eax > > 12: 49 39 c6 cmp %rax,%r14 > > 15: 0f 82 80 fe ff ff jb 0xfffffffffffffe9b > > 1b: 48 c7 c6 a0 3b 55 99 mov $0xffffffff99553ba0,%rsi > > 22: 4c 89 ff mov %r15,%rdi > > 25: e8 7e 9f 07 00 callq 0x79fa8 > > 2a:* 0f 0b ud2 <-- trapping instruction > > 2c: e8 37 ec fd ff callq 0xfffffffffffdec68 > > 31: 4c 89 ff mov %r15,%rdi > > 34: e8 9f c3 03 00 callq 0x3c3d8 > > 39: 84 c0 test %al,%al > > 3b: 0f .byte 0xf > > 3c: 85 .byte 0x85 > > 3d: 2d .byte 0x2d > > 3e: 02 00 add (%rax),%al > > > > Code starting with the faulting instruction > > =========================================== > > 0: 0f 0b ud2 > > 2: e8 37 ec fd ff callq 0xfffffffffffdec3e > > 7: 4c 89 ff mov %r15,%rdi > > a: e8 9f c3 03 00 callq 0x3c3ae > > f: 84 c0 test %al,%al > > 11: 0f .byte 0xf > > 12: 85 .byte 0x85 > > 13: 2d .byte 0x2d > > 14: 02 00 add (%rax),%al > > [ 8590.609335] RSP: 0018:ffffc9000bd976d0 EFLAGS: 00010286 > > [ 8590.609697] RAX: 0000000000000043 RBX: dffffc0000000000 RCX: > > 0000000000000000 > > [ 8590.610771] RDX: 0000000000000001 RSI: 0000000000000004 RDI: > > fffff520017b2eca > > [ 8590.611576] RBP: 0000000000000000 R08: 0000000000000043 R09: > > ffff8888367efd0b > > [ 8590.612349] R10: ffffed1106cfdfa1 R11: 0000000000000001 R12: > > ffff88825a578418 > > [ 8590.613117] R13: 0000000000007340 R14: 000000000000733f R15: > > ffffea000a9e8000 > > [ 8590.613902] FS: 00007f93f6f56740(0000) GS:ffff888836600000(0000) > > knlGS:0000000000000000 > > [ 8590.614346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 8590.615008] CR2: 00007f0576919150 CR3: 0000000649990000 CR4: > > 00000000000406e0 > > [ 8590.615761] Call Trace: > > [ 8590.615914] <TASK> > > [ 8590.616425] ? mapping_evict_folio.part.0 (mm/truncate.c:630) > > > > [86/196] > > [ 8590.616718] ? pagevec_lookup_range_tag (mm/swap.c:1122) > > [ 8590.617492] ? __filemap_fdatawait_range (mm/filemap.c:518) > > [ 8590.618140] ? xas_reload (mm/filemap.c:503) > > [ 8590.618357] ? filemap_fdatawrite_wbc (./include/linux/backing-dev.h:138 > > mm/filemap.c:383) > > [ 8590.619013] ? filemap_range_has_page (mm/filemap.c:498) > > [ 8590.619866] ? delete_from_page_cache_batch (mm/filemap.c:413) > > [ 8590.620201] ? rcu_read_lock_sched_held (kernel/rcu/update.c:125) > > [ 8590.620864] ? filemap_check_errors (./arch/x86/include/asm/bitops.h:207 > > ./include/asm-generic/bitops/instrumented-non-atomic.h:135 mm/filemap.c:351) > > [ 8590.621542] __iomap_dio_rw (fs/iomap/direct-io.c:582) > > [ 8590.621768] ? iomap_dio_bio_iter (fs/iomap/direct-io.c:487) > > [ 8590.622436] ? trace_xfs_setattr (fs/xfs/xfs_iops.c:1020) xfs > > [ 8590.623386] ? iu[ 8590.708322] iomap_dio_rw (fs/iomap/direct-io.c:689) > > [ 8590.724063] xfs_file_dio_write_aligned (fs/xfs/xfs_file.c:536) xfs > > [ 8590.724539] ? xfs_file_dio_write_unaligned (fs/xfs/xfs_file.c:515) xfs > > [ 8590.725586] xfs_file_write_iter (fs/xfs/xfs_file.c:792) xfs > > [ 8590.726382] new_sync_write (fs/read_write.c:505 (discriminator 1)) > > [ 8590.726593] ? new_sync_read (fs/read_write.c:494) > > [ 8590.726801] ? lock_acquire (kernel/locking/lockdep.c:466 > > kernel/locking/lockdep.c:5667 kernel/locking/lockdep.c:5630) > > [ 8590.727040] ? rcu_read_unlock (./include/linux/rcupdate.h:724 (discriminator > > 5)) > > [ 8590.727267] vfs_write (fs/read_write.c:591) > > [ 8590.727815] __x64_sys_pwrite64 (fs/read_write.c:706 fs/read_write.c:716 > > fs/read_write.c:713 fs/read_write.c:713) > > [ 8590.728051] ? vfs_write (fs/read_write.c:713) > > [ 8590.728263] ? ktime_get_coarse_real_ts64 (./include/linux/seqlock.h:104 > > kernel/time/timekeeping.c:2258) > > [ 8590.728546] do_syscall_64 (arch/x86/entry/common.c:50 > > arch/x86/entry/common.c:80) > > [ 8590.728756] ? do_syscall_64 (arch/x86/entry/common.c:87) > > [ 8590.728983] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) > > [ 8590.729600] ? do_syscall_64 (arch/x86/entry/common.c:87) > > [ 8590.729833] ? do_syscall_64+0x69/[ 8591.130203] ? lockdep_hardirqs_on > > (kernel/locking/lockdep.c:4383) > > [ 8591.130866] ? do_syscall_64 (arch/x86/entry/common.c:87) > > [ 8591.131115] ? do_syscall_64 (arch/x86/entry/common.c:87) > > [ 8591.131331] ? do_syscall_64 (arch/x86/entry/common.c:87) > > [ 8591.131576] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) > > [ 8591.132207] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115) > > [ 8591.132501] RIP: 0033:0x7f93f6d3cddf > > [ 8591.132737] Code: 08 89 3c 24 48 89 4c 24 18 e8 6d fe f5 ff 4c 8b 54 24 18 > > 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 12 00 00 00 0f 05 <48> 3d 00 > > f0 ff ff 77 31 44 89 c7 48 89 04 > > 24 e8 bd fe f5 ff 48 8b > > All code > > ======== > > 0: 08 89 3c 24 48 89 or %cl,-0x76b7dbc4(%rcx) > > 6: 4c 24 18 rex.WR and $0x18,%al > > 9: e8 6d fe f5 ff callq 0xfffffffffff5fe7b > > e: 4c 8b 54 24 18 mov 0x18(%rsp),%r10 > > 13: 48 8b 54 24 10 mov 0x10(%rsp),%rdx > > 18: 41 89 c0 mov %eax,%r8d > > 1b: 48 8b 74 24 08 mov 0x8(%rsp),%rsi > > 20: 8b 3c 24 mov (%rsp),%edi > > 23: b8 12 00 00 00 mov $0x12,%eax > > 28: 0f 05 syscall > > 2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- > > trapping instruction > > 30: 77 31 ja 0x63 > > 32: 44 89 c7 mov %r8d,%edi > > 35: 48 89 04 24 mov %rax,(%rsp) > > 39: e8 bd fe f5 ff callq 0xfffffffffff5fefb > > 3e: 48 rex.W > > 3f: 8b .byte 0x8b > > > > Code starting with the faulting instruction > > =========================================== > > 0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax > > 6: 77 31 ja 0x39 > > 8: 44 89 c7 mov %r8d,%edi > > b: 48 89 04 24 mov %rax,(%rsp) > > f: e8 bd fe f5 ff callq 0xfffffffffff5fed1 > > 14: 48 rex.W > > 15: 8b .byte 0x8b > > [ 8591.134056] RSP: 002b:00007ffd8aac5a20 EFLAGS: 00000293 ORIG_RAX: > > 0000000000000012 > > [ 8591.134828] RAX: ffffffffffffffda RBX: 0000000007340000 RCX: > > 00007f93f6d3cddf > > [ 8591.135648] RDX: 0000000000010000 RSI: 0000000000b63000 RDI: > > 0000000000000003 > > [ 8591.136425] RBP: 00000000ffffffff R08: 0000000000000000 R09: > > 0000000000000079 > > [ 8591.137179] R10: 0000000007340000 R11: 0000000000000293 R12: > > 0000000007340000 > > [ 8591.138020] R13: 0000000000000000 R14: 0000000000000734 R15: > > 0000000018cc0000 > > [ 8591.138804] </TASK> > > [ 8591.138985] Modules limi sysimgblt fb_sys_fops hpilo ipmi_si ipmi_devintf > > ipmi_msghandler sunrpc acpi_power_meter drm fuse xfs libcrc32c sd_mod t10_pi > > crc64_rocksoft_generic crc64_rocksoft > > crc64 sr_mod cdrom sg crct10dif_pclmul crc32_pclmul crc32c_intel ahci > > ata_generic libahci ghash_clmulni_intel serio_raw libata hpsa tg3 > > scsi_transport_sas hpwdt [last unloaded: scsi_debug] > > [ 8591.641707] ---[ end trace 0000000000000000 ]--- > > [ 8591.644539] amd_iommu_report_page_fault: 501 callbacks suppressed > > [ 8591.644554] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT > > domain=0x000d address=0xbde0e000 flags=0x0000] > > [ 8591.650728] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT > > domain=0x000d address=0xbde0e000 flags=0x0000] > > [ 8591.655235] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT > > domain=0x000d address=0xbde0e000 flags=0x0000] > > [ 8591.661240] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT > > domain=0x000d address=0xbde0e000 flags=0x0000] > > [ 8591.666237] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT > > domain=0x000d address=0xbde0e000 flags=0x0000] > > [ 8591.670741] hpilo 0000:02:00.2: AMD-ViIP: 0010:invalidate_inode_pages2_range > > (mm/truncate.c:669 (discriminator 1)) > > [ 8591.676251] Code: c0 03 38 d0 7c 08 84 d2 0f 85 aa 06 00 00 41 8b 47 5c 49 > > 39 c6 0f 82 80 fe ff ff 48 c7 c6 a0 3b 55 99 4c 89 ff e8 7e 9f 07 00 <0f> 0b e8 > > 37 ec fd ff 4c 89 ff e8 9f c3 03 > > 00 84 c0 0f 85 2d 02 00 > > All code > > ======== > > 0: c0 03 38 rolb $0x38,(%rbx) > > 3: d0 7c 08 84 sarb -0x7c(%rax,%rcx,1) > > 7: d2 0f rorb %cl,(%rdi) > > 9: 85 aa 06 00 00 41 test %ebp,0x41000006(%rdx) > > f: 8b 47 5c mov 0x5c(%rdi),%eax > > 12: 49 39 c6 cmp %rax,%r14 > > 15: 0f 82 80 fe ff ff jb 0xfffffffffffffe9b > > 1b: 48 c7 c6 a0 3b 55 99 mov $0xffffffff99553ba0,%rsi > > 22: 4c 89 ff mov %r15,%rdi > > 25: e8 7e 9f 07 00 callq 0x79fa8 > > 2a:* 0f 0b ud2 <-- trapping instruction > > 2c: e8 37 ec fd ff callq 0xfffffffffffdec68 > > 31: 4c 89 ff mov %r15,%rdi > > 34: e8 9f c3 03 00 callq 0x3c3d8 > > 39: 84 c0 test %al,%al > > 3b: 0f .byte 0xf > > 3c: 85 .byte 0x85 > > 3d: 2d .byte 0x2d > > 3e: 02 00 add (%rax),%al > > > > Code starting with the faulting instruction > > =========================================== > > 0: 0f 0b ud2 > > 2: e8 37 ec fd ff callq 0xfffffffffffdec3e > > 7: 4c 89 ff mov %r15,%rdi > > a: e8 9f c3 03 00 callq 0x3c3ae > > f: 84 c0 test %al,%al > > 11: 0f .byte 0xf > > 12: 85 .byte 0x85 > > 13: 2d .byte 0x2d > > 14: 02 00 add (%rax),%al > > [ 8591.676261] RSP: 0018:ffffc9000bd976d0 EFLAGS: 00010286 > > [ 8591.676273] RAX: 0000000000000043 RBX: dffffc0000000000 RCX: > > 0000000000000000 > > [ 8591.676279] RDX: 0000000000000001 RSI: 0000000000000004 RDI: > > fffff520017b2eca > > [ 8591.676287] RBP: 0000000000000000 R08: 0000000000000043 R09: > > ffff8888367efd0b > > [ 8591.676293] R10: ffffed1106cfdfa1 R11: 0000000000000001 R12: > > ffff88825a578418 > > [ 8591.676327] R13: 0000000000007340 R14: 000000000000733f R15: > > ffffea000a9e8000 > > [ 8591.676336] FS: 00007f93f6f56740(0000) GS:ffff888836600000(0000) > > knlGS:0000000000000000 > > [ 8591.676344] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 8591.676350] CR2: 00007f0576919150 CR3: 0000000649990000 CR4: > > 00000000000406e0 > > [ 8591.013708] restraintd[1356]: *** Current Time: Fri Jun 10 17:59:44 2022 > > Localwatchdog at: Sun Jun 12 15:40:44 2022 > > [-- MARK -- Fri Jun 10 22:00:00 2022] > > > > -- > > You may reply to this email to add a comment. > > > > You are receiving this mail because: > > You are the assignee for the bug. >