Re: [general question] rare silent data corruption when writing data

On 20/05/13 08:31, Chris Dunlop wrote:
Hi,


"Me too!"

We are seeing 256-byte corruptions which are always the last 256b of a 4K block. The 256b is very often a copy of a "last 256b of 4k block" from earlier in the file. We sometimes see multiple corruptions in the same file, each being a copy of a different 256b from earlier in the file. The original 256b and the copied 256b aren't at any identifiably regular offset from each other. Where the 256b isn't a copy from earlier in the file

I'd be really interested to hear if your problem is just in the last 256b of the 4k block also!

From what I have checked, in my case it has always been a full 4k page.
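
To tell the two patterns apart (only the last 256b of a 4k block versus a full 4k page), a known-good copy of a file can be compared with the corrupted one block by block. The following is only a minimal sketch and not a tool anyone in this thread has used; the function name classify_corruption and the constants BLOCK and TAIL are made up for illustration, and it assumes both copies fit in memory.

```python
BLOCK = 4096  # block/page size discussed in this thread
TAIL = 256    # size of the corrupted tail in the quoted report


def classify_corruption(good: bytes, bad: bytes):
    """Compare a known-good copy with a corrupted copy, block by block.

    Returns a list of (block_index, kind, source_block) tuples. kind is
    "tail" when only the last TAIL bytes of the block differ, and "block"
    when bytes outside that tail differ as well. source_block is the index
    of an earlier block whose tail equals the corrupted tail, or None.
    """
    findings = []
    split = BLOCK - TAIL
    for off in range(0, min(len(good), len(bad)), BLOCK):
        g = good[off:off + BLOCK]
        b = bad[off:off + BLOCK]
        if g == b:
            continue
        idx = off // BLOCK
        if g[:split] == b[:split]:
            # Only the tail differs: look for an earlier block tail
            # that the corrupted tail could have been copied from.
            source = None
            for prev in range(idx):
                start = prev * BLOCK + split
                if good[start:start + TAIL] == b[split:]:
                    source = prev
                    break
            findings.append((idx, "tail", source))
        else:
            findings.append((idx, "block", None))
    return findings
```

Each entry then says whether a damaged block matches the 256b-tail pattern (and which earlier block the tail appears to come from, if any) or whether more of the 4k block is affected.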

I'll follow Sarah's suggestion from the other part of this thread: enable the pagealloc debug options and then put the machine/disks under load. While doing that I'll keep an eye out for anything like what you described.

This will have to wait a bit though, as I have another bug to hunt as well: the journaled raid refuses to assemble, so with the help of Song I'm chasing that issue first.

If not for btrfs, we probably would have been using the machine happily until now (blaming occasional detected issues on userspace stuff, usually some fat java mess).

Thanks for the detailed explanation of what happened in your case (and the span of kernel versions in which it happens is scary). The hardware indeed looks strikingly similar.


