Re: Fw: [Bug 216114] New: page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio, index)) and kernel BUG at mm/truncate.c:669!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jun 11, 2022 at 09:33:39PM +0100, Matthew Wilcox wrote:
> On Sat, Jun 11, 2022 at 01:07:49PM -0700, Andrew Morton wrote:
> > 
> > Hi.    Do you recall if this is a new one?
> 
> New to me.  Does this happen reliably enough to do a bisect?
> 
> Hmm.
> 
>                         folio_lock(folio);
>                         VM_BUG_ON_FOLIO(!folio_contains(folio, index), folio);
>                         if (folio->mapping != mapping) {
>                                 folio_unlock(folio);
>                                 continue;
>                         }
> 
> so we found a truncated folio (below dump shows folio->mapping == NULL).
> We should be able to solve this by simply moving the VM_BUG_ON_FOLIO
> down four lines.
> 
> I'm a little confused that this can happen; the page cache is littered
> with comments saying:
> 
>         /* Leave page->index set: truncation lookup relies upon it */
> 
> so the VM_BUG_ON_FOLIO shouldn't need to be moved and hints at a problem
> that I don't understand.  I also don't understand that PG_head is set,
> and yet dump_page() did not print:
>                 pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
> (is it possible it was inadvertently omitted from the bug report?)

Hi Matthew,

I just checked the full console log, didn't find above warning, it just printed:

  [ 8590.600151] page:0000000027772b07 refcount:2 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x2a7a00

As the current code logic(mm/debug.c::):

        pr_warn("page:%p refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n",
                        page, page_ref_count(head), mapcount, mapping,
                        page_to_pgoff(page), page_to_pfn(page));
        if (compound) {
                pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
                                head, compound_order(head),
                                folio_entire_mapcount(folio),
                                head_compound_pincount(head));
        }

The 1st warn as above, but didn't get the 2nd warn. Might due to the compound==false?

This bug isn't easy to reproduce, I just hit it once on x86_64, I'm still
trying to reproduce it. JFYI, another [mm] bug [1] is 100% reproducible
for me.

Thanks,
Zorro

[1]
kernel BUG at mm/usercopy.c:101!
https://bugzilla.kernel.org/show_bug.cgi?id=216073

> 
> Also a head page should not be able to have an odd index.  So there's a
> lot here that doesn't make sense to me right now.
> 
> > Begin forwarded message:
> > 
> > Date: Sat, 11 Jun 2022 10:50:53 +0000
> > From: bugzilla-daemon@xxxxxxxxxx
> > To: akpm@xxxxxxxxxxxxxxxxxxxx
> > Subject: [Bug 216114] New: page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio, index)) and kernel BUG at mm/truncate.c:669!
> > 
> > 
> > https://bugzilla.kernel.org/show_bug.cgi?id=216114
> > 
> >             Bug ID: 216114
> >            Summary: page dumped because:
> >                     VM_BUG_ON_FOLIO(!folio_contains(folio, index)) and
> >                     kernel BUG at mm/truncate.c:669!
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 5.19-rc1
> >           Hardware: All
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Other
> >           Assignee: akpm@xxxxxxxxxxxxxxxxxxxx
> >           Reporter: zlang@xxxxxxxxxx
> >         Regression: No
> > 
> > xfstests on x86_64 with 64k directory size (mkfs.xfs -n size=65536) XFS[1] hit
> > panic[2]. The kernel HEAD which I used is 
> > 
> > commit 874c8ca1e60b2c564a48f7e7acc40d328d5c8733
> > Author: David Howells <dhowells@xxxxxxxxxx>
> > Date:   Thu Jun 9 21:46:04 2022 +0100
> > 
> >     netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context
> > 
> > 
> > [1]
> > meta-data=/dev/sda4              isize=512    agcount=16, agsize=245696 blks
> >          =                       sectsz=512   attr=2, projid32bit=1
> >          =                       crc=1        finobt=1, sparse=1, rmapbt=0
> >          =                       reflink=1    bigtime=1 inobtcount=1
> > data     =                       bsize=4096   blocks=3931136, imaxpct=25
> >          =                       sunit=64     swidth=192 blks
> > naming   =version 2              bsize=65536  ascii-ci=0, ftype=1
> > log      =internal log           bsize=4096   blocks=16384, version=2
> >          =                       sectsz=512   sunit=64 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > 
> > [2]
> > # ./scripts/decode_stacktrace.sh vmlinux < crash.log                            
> > [ 8525.364621] run fstests generic/132 at 2022-06-10 17:58:32                   
> > [ 8529.173644] XFS (sda4): Mounting V5 Filesystem                       
> > [ 8529.338529] XFS (sda4): Ending clean mount                                   
> > [ 8531.015050] restraintd[1356]: *** Current Time: Fri Jun 10 17:58:44 2022 
> > Localwatchdog at: Sun Jun 12 15:40:44 2022
> > [ 8560.723674] XFS (sda5): Unmounting Filesystem
> > [ 8560.984233] XFS (sda4): EXPERIMENTAL online scrub feature in use. Use at
> > your own risk!
> > [ 8561.787448] XFS (sda4): Unmounting Filesystem
> > [ 8562.925361] XFS (sda4): Mounting V5 Filesystem
> > [ 8563.101997] XFS (sda4): Ending clean mount
> > [ 8563.163581] XFS (sda4): Unmounting Filesystem 
> > [ 8563.890637] XFS (sda5): Mounting V5 Filesystem
> > [ 8564.087515] XFS (sda5): Ending clean mount
> > [ 8567.049332] XFS (sda4): Mounting V5 Filesystem
> > [ 8567.176278] XFS (sda4): Ending clean mount
> > [ 8567.213659] XFS (sda4): Unmounting Filesystem         
> > [ 8567.476953] XFS (sda5): EXPERIMENTAL online scrub feature in use. Use at
> > your own risk!
> > [ 8573.888318] XFS (sda5): Unmounting Filesystem
> > [ 8575.293213] XFS (sda5): Mounting V5 Filesystem
> > [ 8575.545576] XFS (sda5): Ending clean mount
> > [ 8575.883979] run fstests generic/133 at 2022-06-10 17:59:23
> > [ 8590.600151] page:0000000027772b07 refcount:2 mapcount:0
> > mapping:0000000000000000 index:0x1 pfn:0x2a7a00
> > [ 8590.601327] flags:
> > 0x57ffffc0050000(head|reclaim|node=1|zone=2|lastcpupid=0x1fffff)
> > [ 8590.601341] raw: 0057ffffc0050000 0000000000000000 dead000000000122
> > 0000000000000000
> > [ 8590.601345] raw: 0000000000007300 0000000000000000 00000001ffffffff
> > 0000000000000000
> > [ 8590.601348] page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio,
> > index))
> > [ 8590.601416] ------------[ cut here ]------------                             
> > [ 8590.601417] kernel BUG at mm/truncate.c:669!                                 
> > [ 8590.601431] invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI                
> > [ 8590.606841] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 02/06/2014      
> > [ 8590.607178] RIP: 0010:invalidate_inode_pages2_range (mm/truncate.c:669
> > (discriminator 1)) 
> > [ 8590.607924] Code: c0 03 38 d0 7c 08 84 d2 0f 85 aa 06 00 00 41 8b 47 5c 49
> > 39 c6 0f 82 80 fe ff ff 48 c7 c6 a0 3b 55 99 4c 89 ff e8 7e 9f 07 00 <0f> 0b e8
> > 37 ec fd ff 4c 89 ff e8 9f c3 03 
> > 00 84 c0 0f 85 2d 02 00   
> > All code              
> > ========
> >    0:   c0 03 38                rolb   $0x38,(%rbx)        
> >    3:   d0 7c 08 84             sarb   -0x7c(%rax,%rcx,1)     
> >    7:   d2 0f                   rorb   %cl,(%rdi)
> >    9:   85 aa 06 00 00 41       test   %ebp,0x41000006(%rdx)                    
> >    f:   8b 47 5c                mov    0x5c(%rdi),%eax     
> >   12:   49 39 c6                cmp    %rax,%r14                 
> >   15:   0f 82 80 fe ff ff       jb     0xfffffffffffffe9b           
> >   1b:   48 c7 c6 a0 3b 55 99    mov    $0xffffffff99553ba0,%rsi                 
> >   22:   4c 89 ff                mov    %r15,%rdi
> >   25:   e8 7e 9f 07 00          callq  0x79fa8
> >   2a:*  0f 0b                   ud2             <-- trapping instruction
> >   2c:   e8 37 ec fd ff          callq  0xfffffffffffdec68
> >   31:   4c 89 ff                mov    %r15,%rdi
> >   34:   e8 9f c3 03 00          callq  0x3c3d8
> >   39:   84 c0                   test   %al,%al
> >   3b:   0f                      .byte 0xf
> >   3c:   85                      .byte 0x85
> >   3d:   2d                      .byte 0x2d
> >   3e:   02 00                   add    (%rax),%al
> > 
> > Code starting with the faulting instruction
> > ===========================================
> >    0:   0f 0b                   ud2    
> >    2:   e8 37 ec fd ff          callq  0xfffffffffffdec3e
> >    7:   4c 89 ff                mov    %r15,%rdi
> >    a:   e8 9f c3 03 00          callq  0x3c3ae
> >    f:   84 c0                   test   %al,%al
> >   11:   0f                      .byte 0xf
> >   12:   85                      .byte 0x85
> >   13:   2d                      .byte 0x2d
> >   14:   02 00                   add    (%rax),%al
> > [ 8590.609335] RSP: 0018:ffffc9000bd976d0 EFLAGS: 00010286
> > [ 8590.609697] RAX: 0000000000000043 RBX: dffffc0000000000 RCX:
> > 0000000000000000
> > [ 8590.610771] RDX: 0000000000000001 RSI: 0000000000000004 RDI:
> > fffff520017b2eca
> > [ 8590.611576] RBP: 0000000000000000 R08: 0000000000000043 R09:
> > ffff8888367efd0b
> > [ 8590.612349] R10: ffffed1106cfdfa1 R11: 0000000000000001 R12:
> > ffff88825a578418
> > [ 8590.613117] R13: 0000000000007340 R14: 000000000000733f R15:
> > ffffea000a9e8000
> > [ 8590.613902] FS:  00007f93f6f56740(0000) GS:ffff888836600000(0000)
> > knlGS:0000000000000000
> > [ 8590.614346] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 8590.615008] CR2: 00007f0576919150 CR3: 0000000649990000 CR4:
> > 00000000000406e0
> > [ 8590.615761] Call Trace:
> > [ 8590.615914]  <TASK>
> > [ 8590.616425] ? mapping_evict_folio.part.0 (mm/truncate.c:630)                
> >                                                                                
> >                        [86/196]
> > [ 8590.616718] ? pagevec_lookup_range_tag (mm/swap.c:1122) 
> > [ 8590.617492] ? __filemap_fdatawait_range (mm/filemap.c:518) 
> > [ 8590.618140] ? xas_reload (mm/filemap.c:503)  
> > [ 8590.618357] ? filemap_fdatawrite_wbc (./include/linux/backing-dev.h:138
> > mm/filemap.c:383) 
> > [ 8590.619013] ? filemap_range_has_page (mm/filemap.c:498) 
> > [ 8590.619866] ? delete_from_page_cache_batch (mm/filemap.c:413) 
> > [ 8590.620201] ? rcu_read_lock_sched_held (kernel/rcu/update.c:125) 
> > [ 8590.620864] ? filemap_check_errors (./arch/x86/include/asm/bitops.h:207
> > ./include/asm-generic/bitops/instrumented-non-atomic.h:135 mm/filemap.c:351) 
> > [ 8590.621542] __iomap_dio_rw (fs/iomap/direct-io.c:582) 
> > [ 8590.621768] ? iomap_dio_bio_iter (fs/iomap/direct-io.c:487) 
> > [ 8590.622436] ? trace_xfs_setattr (fs/xfs/xfs_iops.c:1020) xfs
> > [ 8590.623386] ? iu[ 8590.708322] iomap_dio_rw (fs/iomap/direct-io.c:689) 
> > [ 8590.724063] xfs_file_dio_write_aligned (fs/xfs/xfs_file.c:536) xfs
> > [ 8590.724539] ? xfs_file_dio_write_unaligned (fs/xfs/xfs_file.c:515) xfs
> > [ 8590.725586] xfs_file_write_iter (fs/xfs/xfs_file.c:792) xfs
> > [ 8590.726382] new_sync_write (fs/read_write.c:505 (discriminator 1)) 
> > [ 8590.726593] ? new_sync_read (fs/read_write.c:494) 
> > [ 8590.726801] ? lock_acquire (kernel/locking/lockdep.c:466
> > kernel/locking/lockdep.c:5667 kernel/locking/lockdep.c:5630) 
> > [ 8590.727040] ? rcu_read_unlock (./include/linux/rcupdate.h:724 (discriminator
> > 5)) 
> > [ 8590.727267] vfs_write (fs/read_write.c:591)  
> > [ 8590.727815] __x64_sys_pwrite64 (fs/read_write.c:706 fs/read_write.c:716
> > fs/read_write.c:713 fs/read_write.c:713) 
> > [ 8590.728051] ? vfs_write (fs/read_write.c:713) 
> > [ 8590.728263] ? ktime_get_coarse_real_ts64 (./include/linux/seqlock.h:104
> > kernel/time/timekeeping.c:2258) 
> > [ 8590.728546] do_syscall_64 (arch/x86/entry/common.c:50
> > arch/x86/entry/common.c:80) 
> > [ 8590.728756] ? do_syscall_64 (arch/x86/entry/common.c:87) 
> > [ 8590.728983] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) 
> > [ 8590.729600] ? do_syscall_64 (arch/x86/entry/common.c:87) 
> > [ 8590.729833] ? do_syscall_64+0x69/[ 8591.130203] ? lockdep_hardirqs_on
> > (kernel/locking/lockdep.c:4383) 
> > [ 8591.130866] ? do_syscall_64 (arch/x86/entry/common.c:87) 
> > [ 8591.131115] ? do_syscall_64 (arch/x86/entry/common.c:87) 
> > [ 8591.131331] ? do_syscall_64 (arch/x86/entry/common.c:87) 
> > [ 8591.131576] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) 
> > [ 8591.132207] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115) 
> > [ 8591.132501] RIP: 0033:0x7f93f6d3cddf
> > [ 8591.132737] Code: 08 89 3c 24 48 89 4c 24 18 e8 6d fe f5 ff 4c 8b 54 24 18
> > 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 12 00 00 00 0f 05 <48> 3d 00
> > f0 ff ff 77 31 44 89 c7 48 89 04 
> > 24 e8 bd fe f5 ff 48 8b
> > All code
> > ========
> >    0:   08 89 3c 24 48 89       or     %cl,-0x76b7dbc4(%rcx)
> >    6:   4c 24 18                rex.WR and $0x18,%al
> >    9:   e8 6d fe f5 ff          callq  0xfffffffffff5fe7b
> >    e:   4c 8b 54 24 18          mov    0x18(%rsp),%r10
> >   13:   48 8b 54 24 10          mov    0x10(%rsp),%rdx
> >   18:   41 89 c0                mov    %eax,%r8d
> >   1b:   48 8b 74 24 08          mov    0x8(%rsp),%rsi
> >   20:   8b 3c 24                mov    (%rsp),%edi
> >   23:   b8 12 00 00 00          mov    $0x12,%eax
> >   28:   0f 05                   syscall 
> >   2a:*  48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax         <--
> > trapping instruction
> >   30:   77 31                   ja     0x63
> >   32:   44 89 c7                mov    %r8d,%edi
> >   35:   48 89 04 24             mov    %rax,(%rsp)
> >   39:   e8 bd fe f5 ff          callq  0xfffffffffff5fefb
> >   3e:   48                      rex.W
> >   3f:   8b                      .byte 0x8b
> > 
> > Code starting with the faulting instruction
> > ===========================================
> >    0:   48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax
> >    6:   77 31                   ja     0x39
> >    8:   44 89 c7                mov    %r8d,%edi
> >    b:   48 89 04 24             mov    %rax,(%rsp)
> >    f:   e8 bd fe f5 ff          callq  0xfffffffffff5fed1
> >   14:   48                      rex.W
> >   15:   8b                      .byte 0x8b
> > [ 8591.134056] RSP: 002b:00007ffd8aac5a20 EFLAGS: 00000293 ORIG_RAX:
> > 0000000000000012
> > [ 8591.134828] RAX: ffffffffffffffda RBX: 0000000007340000 RCX:
> > 00007f93f6d3cddf
> > [ 8591.135648] RDX: 0000000000010000 RSI: 0000000000b63000 RDI:
> > 0000000000000003
> > [ 8591.136425] RBP: 00000000ffffffff R08: 0000000000000000 R09:
> > 0000000000000079
> > [ 8591.137179] R10: 0000000007340000 R11: 0000000000000293 R12:
> > 0000000007340000
> > [ 8591.138020] R13: 0000000000000000 R14: 0000000000000734 R15:
> > 0000000018cc0000
> > [ 8591.138804]  </TASK>
> > [ 8591.138985] Modules limi sysimgblt fb_sys_fops hpilo ipmi_si ipmi_devintf
> > ipmi_msghandler sunrpc acpi_power_meter drm fuse xfs libcrc32c sd_mod t10_pi
> > crc64_rocksoft_generic crc64_rocksoft
> >  crc64 sr_mod cdrom sg crct10dif_pclmul crc32_pclmul crc32c_intel ahci
> > ata_generic libahci ghash_clmulni_intel serio_raw libata hpsa tg3
> > scsi_transport_sas hpwdt [last unloaded: scsi_debug]
> > [ 8591.641707] ---[ end trace 0000000000000000 ]---
> > [ 8591.644539] amd_iommu_report_page_fault: 501 callbacks suppressed
> > [ 8591.644554] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT
> > domain=0x000d address=0xbde0e000 flags=0x0000]
> > [ 8591.650728] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT
> > domain=0x000d address=0xbde0e000 flags=0x0000]
> > [ 8591.655235] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT
> > domain=0x000d address=0xbde0e000 flags=0x0000]
> > [ 8591.661240] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT
> > domain=0x000d address=0xbde0e000 flags=0x0000]
> > [ 8591.666237] hpilo 0000:02:00.2: AMD-Vi: Event logged [IO_PAGE_FAULT
> > domain=0x000d address=0xbde0e000 flags=0x0000]
> > [ 8591.670741] hpilo 0000:02:00.2: AMD-ViIP: 0010:invalidate_inode_pages2_range
> > (mm/truncate.c:669 (discriminator 1)) 
> > [ 8591.676251] Code: c0 03 38 d0 7c 08 84 d2 0f 85 aa 06 00 00 41 8b 47 5c 49
> > 39 c6 0f 82 80 fe ff ff 48 c7 c6 a0 3b 55 99 4c 89 ff e8 7e 9f 07 00 <0f> 0b e8
> > 37 ec fd ff 4c 89 ff e8 9f c3 03 
> > 00 84 c0 0f 85 2d 02 00
> > All code
> > ========
> >    0:   c0 03 38                rolb   $0x38,(%rbx)
> >    3:   d0 7c 08 84             sarb   -0x7c(%rax,%rcx,1)
> >    7:   d2 0f                   rorb   %cl,(%rdi)
> >    9:   85 aa 06 00 00 41       test   %ebp,0x41000006(%rdx)
> >    f:   8b 47 5c                mov    0x5c(%rdi),%eax
> >   12:   49 39 c6                cmp    %rax,%r14
> >   15:   0f 82 80 fe ff ff       jb     0xfffffffffffffe9b
> >   1b:   48 c7 c6 a0 3b 55 99    mov    $0xffffffff99553ba0,%rsi
> >   22:   4c 89 ff                mov    %r15,%rdi
> >   25:   e8 7e 9f 07 00          callq  0x79fa8
> >   2a:*  0f 0b                   ud2             <-- trapping instruction
> >   2c:   e8 37 ec fd ff          callq  0xfffffffffffdec68
> >   31:   4c 89 ff                mov    %r15,%rdi
> >   34:   e8 9f c3 03 00          callq  0x3c3d8
> >   39:   84 c0                   test   %al,%al
> >   3b:   0f                      .byte 0xf
> >   3c:   85                      .byte 0x85
> >   3d:   2d                      .byte 0x2d
> >   3e:   02 00                   add    (%rax),%al
> > 
> > Code starting with the faulting instruction
> > ===========================================
> >    0:   0f 0b                   ud2    
> >    2:   e8 37 ec fd ff          callq  0xfffffffffffdec3e
> >    7:   4c 89 ff                mov    %r15,%rdi
> >    a:   e8 9f c3 03 00          callq  0x3c3ae
> >    f:   84 c0                   test   %al,%al
> >   11:   0f                      .byte 0xf
> >   12:   85                      .byte 0x85
> >   13:   2d                      .byte 0x2d
> >   14:   02 00                   add    (%rax),%al
> > [ 8591.676261] RSP: 0018:ffffc9000bd976d0 EFLAGS: 00010286
> > [ 8591.676273] RAX: 0000000000000043 RBX: dffffc0000000000 RCX:
> > 0000000000000000
> > [ 8591.676279] RDX: 0000000000000001 RSI: 0000000000000004 RDI:
> > fffff520017b2eca
> > [ 8591.676287] RBP: 0000000000000000 R08: 0000000000000043 R09:
> > ffff8888367efd0b
> > [ 8591.676293] R10: ffffed1106cfdfa1 R11: 0000000000000001 R12:
> > ffff88825a578418
> > [ 8591.676327] R13: 0000000000007340 R14: 000000000000733f R15:
> > ffffea000a9e8000
> > [ 8591.676336] FS:  00007f93f6f56740(0000) GS:ffff888836600000(0000)
> > knlGS:0000000000000000
> > [ 8591.676344] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 8591.676350] CR2: 00007f0576919150 CR3: 0000000649990000 CR4:
> > 00000000000406e0
> > [ 8591.013708] restraintd[1356]: *** Current Time: Fri Jun 10 17:59:44 2022 
> > Localwatchdog at: Sun Jun 12 15:40:44 2022
> > [-- MARK -- Fri Jun 10 22:00:00 2022]
> > 
> > -- 
> > You may reply to this email to add a comment.
> > 
> > You are receiving this mail because:
> > You are the assignee for the bug.
> 





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux