On 7/30/18 5:02 AM, Filippo Giunchedi wrote:
> On Sun, Jul 22, 2018 at 2:03 AM Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
>> On 7/20/18 3:20 AM, Filippo Giunchedi wrote:
>>> To recap what we've seen, hardware bit flipping is extremely unlikely:
>>> the same type of sb_fdblocks corruption has appeared on four different
>>> hosts affecting at most one third of xfs filesystems per host. Also
>>> the corruption looks always the same, namely the 33rd bit flipped
>>> which also seems suspicious.
>>
>> Running a debug kernel with memory poisoning, KASAN, or something similar might
>> help catch it if it's a stray memory write of some sort...
>
> Thanks! BTW we've experienced this again on a FS at around 77% usage
> and xfs_repair reports a flip in the 32nd bit (output below). We'll
> enable memory poisoning on said host and see if other filesystems on
> that host experience the same.
> I see in the patch thread it has been mentioned this particular
> condition will be checked and fixed/prevented in 4.19 though the root
> cause isn't known (?)

Yeah, the validation should happen in 4.19, but no idea what the root
cause is. Catching it at write time may offer some clues, I hope, if the
debug kernel doesn't help.

-Eric
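
For anyone wanting to confirm that a bad sb_fdblocks value differs from the expected one by a single flipped bit, here is a minimal Python sketch. The helper name flipped_bits and the sample values are hypothetical and are not taken from the xfs_repair output in this thread; bit positions are counted from 1 at the least significant bit, which is an assumption about how "33rd bit" is meant above.

```python
def flipped_bits(expected, observed):
    """Return the 1-based positions (bit 1 = least significant) where two counters differ."""
    diff = expected ^ observed  # XOR leaves a 1 wherever the two values disagree
    return [i + 1 for i in range(diff.bit_length()) if (diff >> i) & 1]


# Hypothetical free-block counts, NOT values from this thread.
expected = 1_000_000
observed = expected | (1 << 32)  # same value with the 33rd bit set

print(flipped_bits(expected, observed))  # [33]
```

A result with exactly one entry means the two values differ by a single bit; several entries would point away from a simple one-bit flip.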