Hi folks,

I was playing around with some blockchain projects yesterday and had some curious crashes while syncing blockchain databases on XFS filesystems under kernel 6.3:

* kernel 6.3.0 and 6.3.1 (ubuntu mainline)
* w/ and w/o the discard mount flag
* w/ and w/o -m crc=0
* ironfish (nodejs) and ergo (jvm)

The hardware is as follows:

* Asus PRIME H670-PLUS D4
* Intel Core i5-12400
* 32GB DDR4-3200 Non-ECC UDIMM

In all cases the filesystems were newly created under kernel 6.3 on an LVM2 stripe and mounted with the noatime flag. Here is the output of the mkfs.xfs command (after reverting back to 6.2.14, which I realize may not be the most helpful thing, but here it is anyway):

$ sudo lvremove -f vgtethys/ironfish
$ sudo lvcreate -n ironfish -L 10G -i2 vgtethys /dev/nvme[12]n1p3
  Using default stripesize 64.00 KiB.
  Logical volume "ironfish" created.
$ sudo mkfs.xfs -m crc=0 -m uuid=b4725d43-a12d-42df-981a-346af2809fad -s size=4096 /dev/vgtethys/ironfish
meta-data=/dev/vgtethys/ironfish isize=256    agcount=16, agsize=163824 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0, sparse=0, rmapbt=0
         =                       reflink=0    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=2621184, imaxpct=25
         =                       sunit=16     swidth=32 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.

The applications crash with I/O errors. Here's what I see in dmesg:

May 01 18:56:59 tethys kernel: XFS (dm-28): Internal error bno + len > gtbno at line 1908 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_ag_extent+0x14e/0x950 [xfs]
May 01 18:56:59 tethys kernel: CPU: 2 PID: 48657 Comm: node Tainted: P OE 6.3.1-060301-generic #202304302031
May 01 18:56:59 tethys kernel: Hardware name: ASUS System Product Name/PRIME H670-PLUS D4, BIOS 2014 10/14/2022
May 01 18:56:59 tethys kernel: Call Trace:
May 01 18:56:59 tethys kernel: <TASK>
May 01 18:56:59 tethys kernel: dump_stack_lvl+0x48/0x70
May 01 18:56:59 tethys kernel: dump_stack+0x10/0x20
May 01 18:56:59 tethys kernel: xfs_corruption_error+0x9e/0xb0 [xfs]
May 01 18:56:59 tethys kernel: ? xfs_free_ag_extent+0x14e/0x950 [xfs]
May 01 18:56:59 tethys kernel: xfs_free_ag_extent+0x17c/0x950 [xfs]
May 01 18:56:59 tethys kernel: ? xfs_free_ag_extent+0x14e/0x950 [xfs]
May 01 18:56:59 tethys kernel: __xfs_free_extent+0xee/0x1e0 [xfs]
May 01 18:56:59 tethys kernel: xfs_trans_free_extent+0xad/0x1a0 [xfs]
May 01 18:56:59 tethys kernel: xfs_extent_free_finish_item+0x14/0x40 [xfs]
May 01 18:56:59 tethys kernel: xfs_defer_finish_one+0xd9/0x280 [xfs]
May 01 18:56:59 tethys kernel: xfs_defer_finish_noroll+0xab/0x280 [xfs]
May 01 18:56:59 tethys kernel: xfs_defer_finish+0x16/0x80 [xfs]
May 01 18:56:59 tethys kernel: xfs_itruncate_extents_flags+0xe3/0x270 [xfs]
May 01 18:56:59 tethys kernel: xfs_free_eofblocks+0xe3/0x130 [xfs]
May 01 18:56:59 tethys kernel: xfs_release+0x153/0x190 [xfs]
May 01 18:56:59 tethys kernel: xfs_file_release+0x15/0x20 [xfs]
May 01 18:56:59 tethys kernel: __fput+0x95/0x270
May 01 18:56:59 tethys kernel: ____fput+0xe/0x20
May 01 18:56:59 tethys kernel: task_work_run+0x5e/0xa0
May 01 18:56:59 tethys kernel: exit_to_user_mode_loop+0x136/0x160
May 01 18:56:59 tethys kernel: exit_to_user_mode_prepare+0xff/0x110
May 01 18:56:59 tethys kernel: syscall_exit_to_user_mode+0x1b/0x50
May 01 18:56:59 tethys kernel: do_syscall_64+0x67/0x90
May 01 18:56:59 tethys kernel: ? syscall_exit_to_user_mode+0x44/0x50
May 01 18:56:59 tethys kernel: ? do_syscall_64+0x67/0x90
May 01 18:56:59 tethys kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
May 01 18:56:59 tethys kernel: RIP: 0033:0x7f8fce72c6a7
May 01 18:56:59 tethys kernel: Code: 44 00 00 48 8b 15 e9 d7 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 >
May 01 18:56:59 tethys kernel: RSP: 002b:00007f8fb2a67a78 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
May 01 18:56:59 tethys kernel: RAX: 0000000000000000 RBX: 00007f8f98019420 RCX: 00007f8fce72c6a7
May 01 18:56:59 tethys kernel: RDX: 00007f8fce806880 RSI: 00007f8f982a9b40 RDI: 000000000000004c
May 01 18:56:59 tethys kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f8fc02c5520
May 01 18:56:59 tethys kernel: R10: 0000000000000064 R11: 0000000000000202 R12: 00007f8fce807480
May 01 18:56:59 tethys kernel: R13: 0000000000006be1 R14: 0000000000000019 R15: 00007f8f980a8b50
May 01 18:56:59 tethys kernel: </TASK>
May 01 18:56:59 tethys kernel: XFS (dm-28): Corruption detected. Unmount and run xfs_repair
May 01 18:56:59 tethys kernel: XFS (dm-28): Corruption of in-memory data (0x8) detected at xfs_defer_finish_noroll+0x130/0x280 [xfs] (fs/xfs/libxfs/xfs_defer.c:573). Shutting down filesystem.
May 01 18:56:59 tethys kernel: XFS (dm-28): Please unmount the filesystem and rectify the problem(s)

And here's what I see in dmesg after rebooting and attempting to mount the filesystem to replay the log:

May 01 21:34:15 tethys kernel: XFS (dm-35): Metadata corruption detected at xfs_inode_buf_verify+0x168/0x190 [xfs], xfs_inode block 0x1405a0 xfs_inode_buf_verify
May 01 21:34:15 tethys kernel: XFS (dm-35): Unmount and run xfs_repair
May 01 21:34:15 tethys kernel: XFS (dm-35): First 128 bytes of corrupted metadata buffer:
May 01 21:34:15 tethys kernel: 00000000: 5b 40 e2 3a ae 52 a0 7a 17 1d 5a f6 f0 de 4c 62  [@.:.R.z..Z...Lb
May 01 21:34:15 tethys kernel: 00000010: d6 31 8b 51 ca 6e ad a2 7e f5 18 65 6e 8a 41 3f  .1.Q.n..~..en.A?
May 01 21:34:15 tethys kernel: 00000020: 68 b5 02 16 2c 84 5d 33 ac 46 fc c9 da 93 af 3f  h...,.]3.F.....?
May 01 21:34:15 tethys kernel: 00000030: a0 3e b7 9c b4 99 5a 45 8c 2f 13 ed bb 07 57 e1  .>....ZE./....W.
May 01 21:34:15 tethys kernel: 00000040: bc 96 aa d7 00 2a 81 65 e6 3b 86 9d b5 0a 63 bd  .....*.e.;....c.
May 01 21:34:15 tethys kernel: 00000050: 38 e5 63 1a 09 42 36 4c b8 e8 7c 92 73 01 04 da  8.c..B6L..|.s...
May 01 21:34:15 tethys kernel: 00000060: 27 df 43 92 b1 ad ba ec 7a 02 3f 8e 84 3a bb cc  '.C.....z.?..:..
May 01 21:34:15 tethys kernel: 00000070: 39 06 74 d1 8b 04 b7 f2 62 c1 c4 f0 3c 5c 54 4f  9.t.....b...<\TO
May 01 21:34:15 tethys kernel: XFS (dm-35): metadata I/O error in "xlog_recover_items_pass2+0x56/0xf0 [xfs]" at daddr 0x1405a0 len 32 error 117
May 01 21:34:15 tethys kernel: XFS (dm-35): log mount/recovery failed: error -117
May 01 21:34:15 tethys kernel: XFS (dm-35): log mount failed

Blockchain projects tend to generate pathological filesystem loads; the sustained random write activity and constant (re)allocations must be pushing on some soft spot here.

Reverting to kernel 6.2.14 and recreating the filesystems seems to have resolved the issue (so far, at least), but obviously this is less than ideal. If someone would be willing to provide a targeted list of desired artifacts, I'd be happy to boot back into kernel 6.3.1 to reproduce the issue and collect them. Alternatively, I can try to eliminate some variables (like LVM2, potential hardware instabilities, etc.) and provide step-by-step directions for reproducing the issue on another machine.

Thank you,
Mike