On Wed, Jun 25, 2014 at 06:49:39PM +0200, Jan Kara wrote:
>   Hello,
>
> so I've been crash-testing XFS (just killing KVM with XFS filesystem
> mounted) a bit with V5 superblock enabled in 3.16-rc1 and I can pretty
> easily hit CRC mismatches after that. Kernel complains like:

Yes, we had that reported yesterday by another person, using
VirtualBox. I've been unable to reproduce it on my local KVM VMs, so
I'm wondering what your KVM configuration is?

> [518184.794175] XFS (sdb3): Mounting V5 Filesystem
> [518184.902898] XFS (sdb3): Starting recovery (logdev: internal)
> [518187.118860] XFS (sdb3): Metadata CRC error detected at xfs_agf_read_verify+0x5a/0x100 [xfs], block 0x1
> [518187.118870] XFS (sdb3): Unmount and run xfs_repair
> [518187.118875] XFS (sdb3): First 64 bytes of corrupted metadata buffer:
> [518187.118882] ffff880136ffd600: 58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40  XAGF...........@
> [518187.118887] ffff880136ffd610: 00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01  ..mS..w.........
> [518187.118891] ffff880136ffd620: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03  ................
> [518187.118895] ffff880136ffd630: 00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00  ................
> [518187.118923] XFS (sdb3): metadata I/O error: block 0x1 ("xfs_trans_read_buf_map") error 74 numblks 1
>
> So it seem like the checksum doesn't get updated properly in all the cases.
> Looking into the logdump, there doesn't seem to be any modifications for
> this AGF block in unrelayed part of the log but there are some modifications
> in the older parts of the log - the latest LSN where block 1 was updated is
> 1,4639 (and the buffer contents in the log corresponds to the data I see in
> block 1). However the lsn field in AGF structure in that block shows 1,3616
> so that really seems stale (and I've checked and in that transaction the
> block has been modified as well).

That tallies with what has been reported - it was the AGI block,
however.
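For reference, the dump and the two LSNs quoted above can be checked mechanically. The following is only a minimal sketch, not taken from the kernel: it assumes the start of the on-disk AGF is sixteen big-endian 32-bit fields in the order given by struct xfs_agf in the 3.16-era xfs_format.h, and that an LSN is a 64-bit value with the cycle in the high 32 bits and the block in the low 32 bits. The names FIELDS, decode_agf_head and lsn are made up for this illustration.

```python
import struct

# First 64 bytes of the AGF buffer, copied from the dmesg output above.
DUMP = bytes.fromhex(
    "58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40 "
    "00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01 "
    "00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03 "
    "00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00"
)

# Assumed field order for the first 64 bytes of the AGF (hypothetical
# short names; the kernel prefixes them with agf_ and uses arrays for
# the btree roots and levels).
FIELDS = (
    "magicnum", "versionnum", "seqno", "length",
    "bno_root", "cnt_root", "spare0",
    "bno_level", "cnt_level", "spare1",
    "flfirst", "fllast", "flcount",
    "freeblks", "longest", "btreeblks",
)


def decode_agf_head(buf):
    """Decode the first 64 bytes of an AGF as 16 big-endian u32 fields."""
    values = struct.unpack(">16I", buf[:64])
    return dict(zip(FIELDS, values))


def lsn(cycle, block):
    """Pack a (cycle, block) pair into one comparable 64-bit LSN."""
    return (cycle << 32) | block


agf = decode_agf_head(DUMP)
for name in FIELDS:
    print(f"{name:10s} = {agf[name]:#x}")

# LSN stamped in the on-disk AGF versus the latest LSN at which the
# log modified block 1, both taken from the report above.
on_disk = lsn(1, 3616)
latest_logged = lsn(1, 4639)
print("on-disk LSN older than log:", on_disk < latest_logged)
```

Under those assumptions the header decodes to a sane AGF (magic "XAGF", version 1, AG 0), and the comparison shows the stamped LSN is older than the last logged modification, which is the stale-LSN symptom described in the report.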
What I know so far is that the CRC matches for the version of the
structure logged at the apparent LSN, but the data is more recent.

Now the only way I can see the data getting updated without the LSN
being updated is through log recovery. The analysis is here:

http://oss.sgi.com/pipermail/xfs/2014-June/036938.html

At the bottom of the email is a request for information resulting
from a reproduction cycle. Can you run that cycle and provide the
metadumps and dmesg when a problem is first found?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx