On 9/3/15 6:09 AM, Danny Shavit wrote:
> Hi Dave,
>
> We couple of more xfs corruption that we would like to share:

On the same box as the one that seemed to be experiencing some
bit-flips in your earlier email?

As a general note: You are not providing enough information for us to
effectively help you.

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

Kernel version? xfsprogs version? At a bare minimum...

Your dmesg snippets are edited. You've provided what you feel is
important, omitting the parts that may actually be important or
informational.

You haven't described the sequence of events that led to these issues.

You haven't made clear what these attachments are; which repair log
goes with which kernel event?

Etc...

> 1. This is an interesting one, since xfs reported corruption but when
> running xfs_repair, no error was found. Attached is the kernel log
> section regarding the corruption (6458). Does xfs_repair explicitly
> read data from the disk? In such case it might be a memory
> corruption. Are you familiar with such cases?

Yes, xfs_repair opens the block device O_DIRECT.

your 6485-kernel.log shows a failure in xfs_allocbt_verify(), right
after the allocation btree is read from disk. i.e. this is an
in-kernel metadata consistency check that is failing.

It also shows:

kworker/0:1H Tainted: GF W

So it's tainted:

  2: 'F' if any module was force loaded by "insmod -f", ' ' if all
     modules were loaded normally.

 10: 'W' if a warning has previously been issued by the kernel.
     (Though some warnings may set more specific taint flags.)

You force-loaded a module? And previous warnings were emitted (though
we can't see them in your edited dmesg).

All bets are off. If you had included the full dmesg, we might know
more about what's going on, at least.

> 2. xfs corruption occurred suddenly with no apparent external event.
> Attached are xfs_repair and kernel logs are. Xfs dump can be found
> in: https://zadarastorage-public.s3.amazonaws.com/xfs/82.metadump.gz

Your 6442-82-xfs_repair.log is from an xfs_repair -L, so of course it
is finding corruption, and the output is more or less meaningless from
a triage POV.

Repair said:

> Note that destroying the log may cause corruption -- please attempt a mount
> of the filesystem before doing this.

Why did you run it with -L? Did mount fail? If so how?

dm-82-kernel.log also shows a failing verifier, this time
xfs_bmbt_verify, when reading metadata from disk. You've truncated
other parts, though:

Aug 22 23:24:48 vsa-00000110-vc-0 kernel: [4194599.685353] ffff88010ec36000: ea bb 12 3a 5f 44 01 a8 b9 2a 80 10 b3 a7 d5 af ...:_D...* ......

so there's not a ton to go on, just hints that there is more
information that's not provided.

-Eric

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs