Re: [xfstests generic/648] 64k directory block size (-n size=65536) crash on _xfs_buf_ioapply

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Tue, Dec 19, 2023 at 02:34:20PM +0800, Zorro Lang wrote:
> On Mon, Dec 18, 2023 at 09:48:36AM -0800, Darrick J. Wong wrote:
> > On Mon, Dec 18, 2023 at 10:01:34PM +0800, Zorro Lang wrote:
> > > Hi,
> > > 
> > > Recently I hit a crash [1] on s390x with 64k directory block size xfs
> > > (-n size=65536 -m crc=1,finobt=1,reflink=1,rmapbt=0,bigtime=1,inobtcount=1),
> > > even not panic, a assertion failure will happen.
> > > 
> > > I found it from an old downstream kernel at first, then reproduced it
> > > on latest upstream mainline linux (v6.7-rc6). Can't be sure how long
> > > time this issue be there, just reported it at first.
> > > 
> > > I hit and reproduced it several times by running generic/648 on s390x
> > > with "-n size=65536" xfs many times. Not sure if s390x is needed, I'll
> > > try to check other arches.
> > 
> > XFS (loop3): Metadata corruption detected at __xfs_dir3_data_check+0x372/0x6c0 [xfs], xfs_dir3_block block 0x1020
> > 
> > Where in the source code is this             ^^^^^^^^^^^^^^^^^^^^^^^^^^^?
> 
> The source is just based on HEAD="v6.7-rc6", the decode_stacktrace.sh output is:
>   [  979.227613] XFS (loop3): Metadata corruption detected at __xfs_dir3_data_check (fs/xfs/libxfs/xfs_dir2_data.c:137) xfs, xfs_dir3_block block 0x1020
> 
> The code is as below:
> 
>         switch (hdr->magic) {
>         case cpu_to_be32(XFS_DIR3_BLOCK_MAGIC):
>         case cpu_to_be32(XFS_DIR2_BLOCK_MAGIC):
>                 btp = xfs_dir2_block_tail_p(geo, hdr);
>                 lep = xfs_dir2_block_leaf_p(btp);
> 
>                 if (be32_to_cpu(btp->count) >=
>                     xfs_dir2_data_max_leaf_entries(geo))
> 137: ==>                 return __this_address;
>                 break;
> 
> 
> More output as below:
> 
> [  979.225081] XFS: Assertion failed: 0, file: fs/xfs/xfs_buf_item_recover.c, line: 414
> [  979.225120] ------------[ cut here ]------------
> [  979.225122] WARNING: CPU: 1 PID: 157902 at fs/xfs/xfs_message.c:104 assfail (fs/xfs/xfs_message.c:104 (discriminator 3)) xfs
> [  979.225349] Modules linked in: tls loop lcs ctcm fsm qeth ccwgroup zfcp qdio scsi_transport_fc dasd_fba_mod dasd_eckd_mod dasd_mod rfkill sunrpc vfio_ccw mdev vfio_iommu_type1 vfio drm
> +fuse i2c_core drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng des_s390 virtio_net net_failover sha3_512_s390 virtio_blk failover sha3_256_s390 dm_mirror dm_region_hash dm_log
> +dm_mod pkey zcrypt aes_s390
> [  979.225376] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
> [  979.225377] Krnl PSW : 0704c00180000000 000003ff7ffead7a (assfail+0x52/0x68 xfs])
> [  979.225457]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> [  979.225460] Krnl GPRS: c000000000000021 000003ff800549b0 ffffffffffffffea 000000000000000a
> [  979.225462]            0000038000363418 0000000000000000 000003ff80162b74 000000030001c127
> [  979.225464]            00000000ccc51d40 000000030001c127 00000000dfa03000 0000000163d19600
> [  979.225465]            000003ffa1f2ef68 000003ff80147c28 000003ff7ffead68 00000380003634c0
> [ 979.225476] Krnl Code: 000003ff7ffead6e: 95001010 cli 16(%r1),0
> All code
> ========
> 
> Code starting with the faulting instruction
> ===========================================
> [  979.225476]            000003ff7ffead72: a774000a           brc     7,000003ff7ffead86
> [  979.225476]           #000003ff7ffead76: af000000           mc      0,0
> [  979.225476]           >000003ff7ffead7a: eb6ff0a80004       lmg     %r6,%r15,168(%r15)
> [  979.225476]            000003ff7ffead80: 07fe               bcr     15,%r14
> [  979.225476]            000003ff7ffead82: 47000700           bc      0,1792
> [  979.225476]            000003ff7ffead86: af000000           mc      0,0
> [  979.225476]            000003ff7ffead8a: 0707               bcr     0,%r7
> [  979.225487] Call Trace:
> [  979.225488] assfail (fs/xfs/xfs_message.c:105 (discriminator 3)) xfs
> [  979.225568] assfail+0x40/0x68 xfs])
> [  979.225648] xlog_recover_validate_buf_type (fs/xfs/xfs_buf_item_recover.c:414 (discriminator 1)) xfs
> [  979.225727] xlog_recover_buf_commit_pass2 (fs/xfs/xfs_buf_item_recover.c:971) xfs
> [  979.225807] xlog_recover_items_pass2 (fs/xfs/xfs_log_recover.c:1957) xfs
> [  979.225886] xlog_recover_commit_trans (./include/linux/list.h:373 ./include/linux/list.h:588 fs/xfs/xfs_log_recover.c:2028) xfs
> [  979.225965] xlog_recovery_process_trans (fs/xfs/xfs_log_recover.c:2257) xfs
> [  979.226045] xlog_recover_process_ophdr (fs/xfs/xfs_log_recover.c:2405) xfs
> [  979.226124] xlog_recover_process_data (fs/xfs/xfs_log_recover.c:2447) xfs
> [  979.226204] xlog_recover_process (fs/xfs/xfs_log_recover.c:2901) xfs
> [  979.226285] xlog_do_recovery_pass (fs/xfs/xfs_log_recover.c:3153) xfs
> [  979.226366] xlog_do_log_recovery (fs/xfs/xfs_log_recover.c:3265) xfs
> [  979.226446] xlog_do_recover (fs/xfs/xfs_log_recover.c:3293) xfs
> [  979.226526] xlog_recover (./arch/s390/include/asm/atomic_ops.h:71 (discriminator 1) ./arch/s390/include/asm/bitops.h:63 (discriminator 1) ./include/asm-generic/bitops/instrumented-atomic.h:29 (discriminator 1) fs/xfs/xfs_log_recover.c:3427 (discriminator 1)) xfs
> [  979.226607] xfs_log_mount (fs/xfs/xfs_log.c:718) xfs
> [  979.226686] xfs_mountfs (fs/xfs/xfs_mount.c:822) xfs
> [  979.226765] xfs_fs_fill_super (fs/xfs/xfs_super.c:1734) xfs
> [  979.226844] get_tree_bdev (fs/super.c:1598) 
> [  979.226850] vfs_get_tree (fs/super.c:1771) 
> [  979.226852] do_new_mount (fs/namespace.c:3337) 
> [  979.226855] path_mount (fs/namespace.c:3664) 
> [  979.226857] __s390x_sys_mount (fs/namespace.c:3677 fs/namespace.c:3886 fs/namespace.c:3863 fs/namespace.c:3863) 
> [  979.226860] __do_syscall (arch/s390/kernel/syscall.c:144 arch/s390/kernel/syscall.c:168) 
> [  979.226864] system_call (arch/s390/kernel/entry.S:309) 
> [  979.226868] Last Breaking-Event-Address:
> [  979.226868] xfs_printk_level (fs/xfs/xfs_message.c:52) xfs
> [  979.226949] ---[ end trace 0000000000000000 ]---
> [  979.227613] XFS (loop3): Metadata corruption detected at __xfs_dir3_data_check (fs/xfs/libxfs/xfs_dir2_data.c:137) xfs, xfs_dir3_block block 0x1020
> [  979.227732] XFS (loop3): Unmount and run xfs_repair
> [  979.227733] XFS (loop3): First 128 bytes of corrupted metadata buffer:
> [  979.227736] 00000000: 58 44 42 33 00 00 00 00 00 00 00 00 00 00 10 20  XDB3...........
> [  979.227738] 00000010: 00 00 00 00 00 00 00 00 c1 95 44 38 a1 8d 4b 4a  ..........D8..KJ
> [  979.227739] 00000020: ad 32 0e 29 c4 07 13 ed 00 00 00 00 00 00 00 80  .2.)............
> [  979.227740] 00000030: 02 68 fc c8 00 00 00 00 00 00 00 00 00 00 00 00  .h..............
> [  979.227741] 00000040: 00 00 00 00 00 00 00 80 01 2e 02 00 00 00 00 40  ...............@
> [  979.227742] 00000050: 00 00 00 00 00 00 00 80 02 2e 2e 02 00 00 00 50  ...............P
> [  979.227743] 00000060: 00 00 00 00 00 00 00 83 08 66 73 73 32 36 30 38  .........fss2608
> [  979.227745] 00000070: 36 01 00 00 00 00 00 60 00 00 00 00 00 00 00 84  6......`........
> [  979.227748] XFS (loop3): Corruption of in-memory data (0x8) detected at __xfs_buf_submit+0x78/0x230 [xfs] (fs/xfs/xfs_buf.c:1558). Shutting down filesystem.
> 
> 
> 
> 
> > 
> > Also, does "xfs: update dir3 leaf block metadata after swap" fix it?
> 
> OK, I'll merge and give it a try.

It's still reproducible on xfs-linux for-next branch xfs-6.8-merge-2, which
contains the 5759aa4f9560 ("xfs: update dir3 leaf block metadata after swap")

Thanks,
Zorro

> 
> Thanks,
> Zorro
> 
> > 
> > (I suspect it won't because that's a dirent block, but for-next has a
> > few directory fixes in it.)
> > 
> > --D
> > 
> > > 
> > > Thanks,
> > > Zorro
> > > 
> > > [1]
> > >  [  899.271221] run fstests generic/648 at 2023-12-18 01:18:03
> > >  [  899.454444] XFS (loop1): Mounting V5 Filesystem 9be840e1-9bd1-468e-81c5-179f2ebdce56
> > >  [  899.455844] XFS (loop1): Ending clean mount
> > >  [  899.463623] XFS (loop1): Unmounting Filesystem 9be840e1-9bd1-468e-81c5-179f2ebdce56
> > >  [  899.587365] XFS (dm-3): Mounting V5 Filesystem d636e675-dfc9-4e5d-b777-ff856af136a9
> > >  [  899.588501] XFS (dm-3): Ending clean mount
> > >  ...
> > >  ...
> > >  [  974.217814] XFS (loop3): Mounting V5 Filesystem c1954438-a18d-4b4a-ad32-0e29c40713ed
> > >  [  974.803146] XFS (loop3): Starting recovery (logdev: internal)
> > >  [  974.804634] XFS (loop3): Ending recovery (logdev: internal)
> > >  [  975.782734] Direct I/O collision with buffered writes! File: /p2/df/f4f Comm: fsstress
> > >  [  976.831942] lo_write_bvec: 15 callbacks suppressed
> > >  [  976.831946] loop: Write error at byte offset 1766129664, length 4096.
> > >  [  976.831953] I/O error, dev loop3, sector 3449468 op 0x1:(WRITE) flags 0x9800 phys_seg 1 prio class 2
> > >  [  976.831957] I/O error, dev loop3, sector 3449468 op 0x1:(WRITE) flags 0x9800 phys_seg 1 prio class 2
> > >  [  976.831960] XFS (loop3): log I/O error -5
> > >  [  976.831963] XFS (loop3): Filesystem has been shut down due to log error (0x2).
> > >  [  976.831964] XFS (loop3): Please unmount the filesystem and rectify the problem(s).
> > >  [  976.839603] buffer_io_error: 6 callbacks suppressed
> > >  [  976.839604] Buffer I/O error on dev dm-3, logical block 20971392, async page read
> > >  [  976.839607] Buffer I/O error on dev dm-3, logical block 20971393, async page read
> > >  [  976.839609] Buffer I/O error on dev dm-3, logical block 20971394, async page read
> > >  [  976.839611] Buffer I/O error on dev dm-3, logical block 20971395, async page read
> > >  [  976.839612] Buffer I/O error on dev dm-3, logical block 20971396, async page read
> > >  [  976.839613] Buffer I/O error on dev dm-3, logical block 20971397, async page read
> > >  [  976.839614] Buffer I/O error on dev dm-3, logical block 20971398, async page read
> > >  [  976.839616] Buffer I/O error on dev dm-3, logical block 20971399, async page read
> > >  [  977.419266] XFS (loop3): Unmounting Filesystem c1954438-a18d-4b4a-ad32-0e29c40713ed
> > >  [  977.423981] iomap_finish_ioend: 15 callbacks suppressed
> > >  [  977.423982] dm-3: writeback error on inode 131, offset 1766125568, sector 4059696
> > >  [  977.423988] I/O error, dev loop3, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
> > >  [  978.434124] XFS (dm-3): Unmounting Filesystem d636e675-dfc9-4e5d-b777-ff856af136a9
> > >  [  978.449156] XFS (dm-3): log I/O error -5
> > >  [  978.449159] XFS (dm-3): Filesystem has been shut down due to log error (0x2).
> > >  [  978.449160] XFS (dm-3): Please unmount the filesystem and rectify the problem(s).
> > >  [  978.449163] XFS (dm-3): log I/O error -5
> > >  [  978.449164] XFS (dm-3): log I/O error -5
> > >  [  978.449165] XFS (dm-3): log I/O error -5
> > >  [  978.449166] XFS (dm-3): log I/O error -5
> > >  [  978.449167] XFS (dm-3): log I/O error -5
> > >  [  978.498053] XFS (dm-3): Mounting V5 Filesystem d636e675-dfc9-4e5d-b777-ff856af136a9
> > >  [  978.513331] XFS (dm-3): Starting recovery (logdev: internal)
> > >  [  978.584227] XFS (dm-3): Ending recovery (logdev: internal)
> > >  [  978.587276] loop3: detected capacity change from 0 to 6876346
> > >  [  978.591588] XFS (loop3): Mounting V5 Filesystem c1954438-a18d-4b4a-ad32-0e29c40713ed
> > >  [  979.216565] XFS (loop3): Starting recovery (logdev: internal)
> > >  [  979.225078] XFS (loop3): Bad dir block magic!
> > >  [  979.225081] XFS: Assertion failed: 0, file: fs/xfs/xfs_buf_item_recover.c, line: 414
> > >  [  979.225120] ------------[ cut here ]------------
> > >  [  979.225122] WARNING: CPU: 1 PID: 157902 at fs/xfs/xfs_message.c:104 assfail+0x4e/0x68 [xfs]
> > >  [  979.225349] Modules linked in: tls loop lcs ctcm fsm qeth ccwgroup zfcp qdio scsi_transport_fc dasd_fba_mod dasd_eckd_mod dasd_mod rfkill sunrpc vfio_ccw mdev vfio_iommu_type1 vfio drm fuse i2c_core drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng des_s390 virtio_net net_failover sha3_512_s390 virtio_blk failover sha3_256_s390 dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt aes_s390
> > >  [  979.225373] CPU: 1 PID: 157902 Comm: mount Kdump: loaded Not tainted 6.7.0-rc6 #1
> > >  [  979.225376] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
> > >  [  979.225377] Krnl PSW : 0704c00180000000 000003ff7ffead7a (assfail+0x52/0x68 [xfs])
> > >  [  979.225457]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> > >  [  979.225460] Krnl GPRS: c000000000000021 000003ff800549b0 ffffffffffffffea 000000000000000a
> > >  [  979.225462]            0000038000363418 0000000000000000 000003ff80162b74 000000030001c127
> > >  [  979.225464]            00000000ccc51d40 000000030001c127 00000000dfa03000 0000000163d19600
> > >  [  979.225465]            000003ffa1f2ef68 000003ff80147c28 000003ff7ffead68 00000380003634c0
> > >  [  979.225476] Krnl Code: 000003ff7ffead6e: 95001010		cli	16(%r1),0
> > >  [  979.225476]            000003ff7ffead72: a774000a		brc	7,000003ff7ffead86
> > >  [  979.225476]           #000003ff7ffead76: af000000		mc	0,0
> > >  [  979.225476]           >000003ff7ffead7a: eb6ff0a80004	lmg	%r6,%r15,168(%r15)
> > >  [  979.225476]            000003ff7ffead80: 07fe		bcr	15,%r14
> > >  [  979.225476]            000003ff7ffead82: 47000700		bc	0,1792
> > >  [  979.225476]            000003ff7ffead86: af000000		mc	0,0
> > >  [  979.225476]            000003ff7ffead8a: 0707		bcr	0,%r7
> > >  [  979.225487] Call Trace:
> > >  [  979.225488]  [<000003ff7ffead7a>] assfail+0x52/0x68 [xfs] 
> > >  [  979.225568] ([<000003ff7ffead68>] assfail+0x40/0x68 [xfs])
> > >  [  979.225648]  [<000003ff80004b4e>] xlog_recover_validate_buf_type+0x2a6/0x5c8 [xfs] 
> > >  [  979.225727]  [<000003ff80005eca>] xlog_recover_buf_commit_pass2+0x382/0x448 [xfs] 
> > >  [  979.225807]  [<000003ff8001077a>] xlog_recover_items_pass2+0x72/0xf0 [xfs] 
> > >  [  979.225886]  [<000003ff80011436>] xlog_recover_commit_trans+0x39e/0x3c0 [xfs] 
> > >  [  979.225965]  [<000003ff80011598>] xlog_recovery_process_trans+0x140/0x148 [xfs] 
> > >  [  979.226045]  [<000003ff80011660>] xlog_recover_process_ophdr+0xc0/0x180 [xfs] 
> > >  [  979.226124]  [<000003ff80012126>] xlog_recover_process_data+0xb6/0x168 [xfs] 
> > >  [  979.226204]  [<000003ff800122dc>] xlog_recover_process+0x104/0x150 [xfs] 
> > >  [  979.226285]  [<000003ff800125a4>] xlog_do_recovery_pass+0x27c/0x748 [xfs] 
> > >  [  979.226366]  [<000003ff80012ec8>] xlog_do_log_recovery+0x88/0xd8 [xfs] 
> > >  [  979.226446]  [<000003ff80012f64>] xlog_do_recover+0x4c/0x218 [xfs] 
> > >  [  979.226526]  [<000003ff8001446a>] xlog_recover+0xda/0x1a0 [xfs] 
> > >  [  979.226607]  [<000003ff7fffa51e>] xfs_log_mount+0x11e/0x280 [xfs] 
> > >  [  979.226686]  [<000003ff7ffec226>] xfs_mountfs+0x3e6/0x928 [xfs] 
> > >  [  979.226765]  [<000003ff7fff3b4c>] xfs_fs_fill_super+0x40c/0x7d8 [xfs] 
> > >  [  979.226844]  [<00000001ef53bfd4>] get_tree_bdev+0x144/0x1d0 
> > >  [  979.226850]  [<00000001ef539a08>] vfs_get_tree+0x38/0x110 
> > >  [  979.226852]  [<00000001ef56c432>] do_new_mount+0x17a/0x2d0 
> > >  [  979.226855]  [<00000001ef56d0ac>] path_mount+0x1ac/0x818 
> > >  [  979.226857]  [<00000001ef56d81c>] __s390x_sys_mount+0x104/0x148 
> > >  [  979.226860]  [<00000001efbb6e00>] __do_syscall+0x1d0/0x1f8 
> > >  [  979.226864]  [<00000001efbc7088>] system_call+0x70/0x98 
> > >  [  979.226868] Last Breaking-Event-Address:
> > >  [  979.226868]  [<000003ff7ffeabc4>] xfs_printk_level+0xac/0xd8 [xfs]
> > >  [  979.226949] ---[ end trace 0000000000000000 ]---
> > >  [  979.227613] XFS (loop3): Metadata corruption detected at __xfs_dir3_data_check+0x372/0x6c0 [xfs], xfs_dir3_block block 0x1020 
> > >  [  979.227732] XFS (loop3): Unmount and run xfs_repair
> > >  [  979.227733] XFS (loop3): First 128 bytes of corrupted metadata buffer:
> > >  [  979.227736] 00000000: 58 44 42 33 00 00 00 00 00 00 00 00 00 00 10 20  XDB3........... 
> > >  [  979.227738] 00000010: 00 00 00 00 00 00 00 00 c1 95 44 38 a1 8d 4b 4a  ..........D8..KJ
> > >  [  979.227739] 00000020: ad 32 0e 29 c4 07 13 ed 00 00 00 00 00 00 00 80  .2.)............
> > >  [  979.227740] 00000030: 02 68 fc c8 00 00 00 00 00 00 00 00 00 00 00 00  .h..............
> > >  [  979.227741] 00000040: 00 00 00 00 00 00 00 80 01 2e 02 00 00 00 00 40  ...............@
> > >  [  979.227742] 00000050: 00 00 00 00 00 00 00 80 02 2e 2e 02 00 00 00 50  ...............P
> > >  [  979.227743] 00000060: 00 00 00 00 00 00 00 83 08 66 73 73 32 36 30 38  .........fss2608
> > >  [  979.227745] 00000070: 36 01 00 00 00 00 00 60 00 00 00 00 00 00 00 84  6......`........
> > >  [  979.227748] XFS (loop3): Corruption of in-memory data (0x8) detected at __xfs_buf_submit+0x78/0x230 [xfs] (fs/xfs/xfs_buf.c:1558).  Shutting down filesystem.
> > >  [  979.227826] XFS (loop3): Please unmount the filesystem and rectify the problem(s)
> > >  [  979.227912] XFS (loop3): log mount/recovery failed: error -117
> > >  [  979.228061] XFS (loop3): log mount failed
> > >  [  980.224124] XFS (dm-3): Unmounting Filesystem d636e675-dfc9-4e5d-b777-ff856af136a9
> > >  [  980.318601] XFS (loop0): Unmounting Filesystem c8777546-cb21-46f2-9963-456588f60b85
> > >  [  980.353946] XFS (loop0): Mounting V5 Filesystem c8777546-cb21-46f2-9963-456588f60b85
> > >  [  980.355207] XFS (loop0): Ending clean mount
> > >  [  982.585362] XFS (loop1): Mounting V5 Filesystem ec19bbbd-e925-41c9-befe-28218e74a23b
> > >  [  982.586881] XFS (loop1): Ending clean mount
> > >  [  982.588944] XFS (loop1): Unmounting Filesystem ec19bbbd-e925-41c9-befe-28218e74a23b
> > >  [  982.603160] XFS (loop0): EXPERIMENTAL online scrub feature in use. Use at your own risk!
> > >  [  982.620838] XFS (loop0): Unmounting Filesystem c8777546-cb21-46f2-9963-456588f60b85
> > >  [  982.636206] XFS (loop0): Mounting V5 Filesystem c8777546-cb21-46f2-9963-456588f60b85
> > >  [  982.637231] XFS (loop0): Ending clean mount
> > >  
> > > 
> > > 
> > 





[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux