On Sun, Jan 21, 2024 at 10:58:49AM +1100, Dave Chinner wrote:
> On Sat, Jan 20, 2024 at 07:26:00PM +0800, Zorro Lang wrote:
> > On Fri, Jan 19, 2024 at 06:17:24PM +1100, Dave Chinner wrote:
> > > Perhaps a bisect from 6.7 to 6.7+linux-xfs/for-next to identify what
> > > fixed it? Nothing in the for-next branch really looks relevant to
> > > the problem to me....
> >
> > Hi Dave,
> >
> > Finally, I got a chance to reproduce this issue on latest upstream mainline
> > linux (HEAD=9d64bf433c53) (and linux-xfs) again.
> >
> > Looks like some userspace updates hide the issue, but I haven't found out what
> > change does that, due to it's a big change about a whole system version. I
> > reproduced this issue again by using an old RHEL distro (but the kernel is the newest).
> > (I'll try to find out what changes cause that later if it's necessary)
> >
> > Anyway, I enabled the "CONFIG_XFS_ASSERT_FATAL=y" and "CONFIG_XFS_DEBUG=y" as
> > you suggested. And got the xfs metadump file after it crashed [1] and rebooted.
> >
> > Due to g/648 tests on a loopimg in SCRATCH_MNT, so I didn't dump the SCRATCH_DEV,
> > but dumped the $SCRATCH_MNT/testfs file, you can get the metadump file from:
> >
> > https://drive.google.com/file/d/14q7iRl7vFyrEKvv_Wqqwlue6vHGdIFO1/view?usp=sharing
>
> Ok, I forgot the log on s390 is in big endian format. I don't have a
> bigendian machine here, so I can't replay the log to trace it or
> find out what disk address the buffer belongs. I can't even use
> xfs_logprint to dump the log.
>
> Can you take that metadump, restore it on the s390 machine, and
> trace a mount attempt? i.e in one shell run 'trace-cmd record -e
> xfs\*' and then in another shell run 'mount testfs.img /mnt/test'

The 'mount testfs.img /mnt/test' will crash the kernel and reboot the system directly ...

> and then after the assert fail terminate the tracing and run
> 'trace-cmd report > testfs.trace.txt'?

... Can I still get the trace report after rebooting?

Thanks,
Zorro

>
> The trace will tell me what buffer was being replayed when the
> failure occurred, and from there I can look at the raw dump of the
> log and the buffer on disk and go from there...
>
> > [ 1707.044730] XFS (loop3): Mounting V5 Filesystem 59e2f6ae-ceab-4232-9531-a85417847238
> > [ 1707.061925] XFS (loop3): Starting recovery (logdev: internal)
> > [ 1707.079549] XFS (loop3): Bad dir block magic!
>
> At minimum, this error message will need to be improved to tell us
> what buffer failed this check....
>
> -Dave.
>
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
>