On Thu, Jan 18, 2024 at 03:20:21PM +1100, Dave Chinner wrote:
> On Mon, Dec 18, 2023 at 10:01:34PM +0800, Zorro Lang wrote:
> > Hi,
> >
> > Recently I hit a crash [1] on s390x with 64k directory block size xfs
> > (-n size=65536 -m crc=1,finobt=1,reflink=1,rmapbt=0,bigtime=1,inobtcount=1),
> > even not panic, a assertion failure will happen.
> >
> > I found it from an old downstream kernel at first, then reproduced it
> > on latest upstream mainline linux (v6.7-rc6). Can't be sure how long
> > time this issue be there, just reported it at first.
> > [ 978.591588] XFS (loop3): Mounting V5 Filesystem c1954438-a18d-4b4a-ad32-0e29c40713ed
> > [ 979.216565] XFS (loop3): Starting recovery (logdev: internal)
> > [ 979.225078] XFS (loop3): Bad dir block magic!
> > [ 979.225081] XFS: Assertion failed: 0, file: fs/xfs/xfs_buf_item_recover.c, line: 414
>
> Ok, so we got a XFS_BLFT_DIR_BLOCK_BUF buf log item, but the object
> that we recovered into the buffer did not have a
> XFS_DIR3_BLOCK_MAGIC type.
>
> Perhaps the buf log item didn't contain the first 128 bytes of the
> buffer (or maybe any of it), and so didn't recovery the magic number?
>
> Can you reproduce this with CONFIG_XFS_ASSERT_FATAL=y so the failure
> preserves the journal contents when the issue triggers, then get a
> metadump of the filesystem so I can dig into the contents of the
> journal? I really want to see what is in the buf log item we fail
> to recover.
>
> We don't want recovery to continue here because that will result in
> the journal being fully recovered and updated and so we won't be
> able to replay the recovery failure from it.
>
> i.e. if we leave the buffer we recovered in memory without failure
> because the ASSERT is just a warn, we continue onwards and likely
> then recover newer changes over the top of it. This may or may
> not result in a correctly recovered buffer, depending on what parts
> of the buffer got relogged.
>
> IOWs, we should be expecting corruption to be detected somewhere
> further down the track once we've seen this warning, and really we
> should be aborting journal recovery if we see a mismatch like this.
>
> .....
>
> > [ 979.227613] XFS (loop3): Metadata corruption detected at __xfs_dir3_data_check+0x372/0x6c0 [xfs], xfs_dir3_block block 0x1020
> > [ 979.227732] XFS (loop3): Unmount and run xfs_repair
> > [ 979.227733] XFS (loop3): First 128 bytes of corrupted metadata buffer:
> > [ 979.227736] 00000000: 58 44 42 33 00 00 00 00 00 00 00 00 00 00 10 20  XDB3...........
>
> XDB3 is XFS_DIR3_BLOCK_MAGIC, so it's the right type, but given it's
> the tail pointer (btp->count) that is bad, this indicates that maybe
> the tail didn't get written correctly by subsequent checkpoint
> recoveries. We don't know, because that isn't in the output below.
>
> It likely doesn't matter, because I think the problem is either a
> runtime problem writing bad stuff into the journal, or a recovery
> problem failing to handle the contents correctly. Hence the need for
> a metadump.

Hi Dave,

Thanks for your reply. It's been a month since I last reported this bug.
Now I can't reproduce this issue on the latest upstream mainline linux or
the xfs-linux for-next branch. I've run the same test ~1000 times and
still can't reproduce it.

If you think it might not be fixed but merely hidden, I can try again on
the older kernel that reproduced this bug last time, to get a metadump.
What do you think?

Thanks,
Zorro

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
>
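
For readers not familiar with this code path: the check Dave describes pairs
the buffer type recorded in the buf log item (XFS_BLFT_DIR_BLOCK_BUF) with the
magic number found at the start of the recovered buffer. Below is a simplified,
standalone sketch of that style of cross-check; it is not the actual
fs/xfs/xfs_buf_item_recover.c code, and the helper names (validate_buf_type,
be32_to_host, enum blft) are illustrative only. It just shows why a buffer
whose first bytes were never replayed from the log item trips the
"Bad dir block magic!" warning and the ASSERT.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* on-disk magics are stored big-endian; values from the XFS on-disk format */
#define XFS_DIR2_BLOCK_MAGIC	0x58443242u	/* "XD2B" */
#define XFS_DIR3_BLOCK_MAGIC	0x58444233u	/* "XDB3" */

/* hypothetical, trimmed-down version of the logged buffer type */
enum blft { BLFT_UNKNOWN, BLFT_DIR_BLOCK };

static uint32_t be32_to_host(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical stand-in for the post-replay validation step: the type comes
 * from the buf log item, the magic comes from the buffer contents that were
 * just recovered.  If the log item never covered the first bytes of the
 * buffer, the magic is whatever was there before and the check fires.
 */
static int validate_buf_type(enum blft type, const unsigned char *buf)
{
	uint32_t magic = be32_to_host(buf);

	switch (type) {
	case BLFT_DIR_BLOCK:
		if (magic != XFS_DIR2_BLOCK_MAGIC &&
		    magic != XFS_DIR3_BLOCK_MAGIC) {
			fprintf(stderr, "Bad dir block magic!\n");
			assert(0);	/* like ASSERT(0) with fatal asserts */
			return -1;
		}
		return 0;
	default:
		return -1;
	}
}

int main(void)
{
	unsigned char good[128] = { 'X', 'D', 'B', '3' };	/* magic replayed */
	unsigned char bad[128]  = { 0 };			/* magic missing  */

	printf("good: %d\n", validate_buf_type(BLFT_DIR_BLOCK, good));	/* 0 */
	printf("bad:  %d\n", validate_buf_type(BLFT_DIR_BLOCK, bad));	/* aborts */
	return 0;
}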