Re: Why does one get mismatches?

Neil Brown wrote:
On Wed, 17 Feb 2010 08:38:11 +1100
Steven Haigh <netwiz@xxxxxxxxx> wrote:

On Tue, 16 Feb 2010 16:25:25 -0500, Bill Davidsen <davidsen@xxxxxxx>
wrote:
Bryan Mesich wrote:
On Thu, Feb 11, 2010 at 04:14:44PM +1100, Neil Brown wrote:
This whole discussion simply shows that for RAID-1, software RAID is less reliable than hardware RAID (no, I don't mean fake-RAID), because it doesn't pin the data buffer until all copies are written.
That doesn't make it less reliable.  It just makes it more confusing.
I agree that Linux software RAID is no less reliable than
hardware RAID with regard to the above conversation.  It is,
however, confusing to have a counter that indicates there are
problems with a RAID 1 array when in fact there are none.
Sorry, but real hardware raid is more reliable than software raid, and Neil's justification for not doing smart recovery mentions it. Note that this refers to real hardware raid, not fakeraid, which is just some firmware in a BIOS to use the existing hardware.

The issue lies with data changing between the writes to multiple drives. In hardware raid the data traverses the memory bus once, only once, and goes into cache in the controller, from which it is written to all mirrored drives. With software raid an individual write is done to each drive, and if the data in the buffer changes between the write to one drive and the write to the other, you get different values on each. Neil may be convinced that the OS somehow "knows" which of the mirror copies is correct, i.e. most recent, and never uses the stale data, but if that information were really available, reads would always return the latest value and it wouldn't be possible to read the same file multiple times and get different MD5sums. It would also be possible to do a stable smart recovery by propagating the most recent copy to the other mirror drives.
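
To make the scenario above concrete, here is a minimal user-space sketch in Python. It is not md code: the "drives" are plain bytearrays, and `mirror_write`, `scribble` and the `snapshot` flag are hypothetical names invented for this illustration. It only shows how one write per mirror from a live buffer can leave the mirrors different, and how writing every mirror from a single copy avoids that. SHA-256 stands in for the MD5sums mentioned above.

```python
import hashlib


def mirror_write(buf, mirrors, between_writes=None, snapshot=False):
    """Write buf to each mirror in turn, one write per device.

    snapshot=True copies the buffer once and writes every mirror from
    that copy (an assumption standing in for "pinning" the data).
    """
    source = bytes(buf) if snapshot else buf
    for index, mirror in enumerate(mirrors):
        mirror[:] = source
        # Simulate the owner of the buffer modifying it after the first
        # mirror has been written but before the second one is.
        if between_writes is not None and index == 0:
            between_writes(buf)


def digest(block):
    """Checksum of one mirror's contents."""
    return hashlib.sha256(block).hexdigest()


def scribble(buf):
    """Hypothetical in-flight modification of the buffer."""
    buf[0:5] = b"DIRTY"


# Case 1: write straight from the live buffer.
page = bytearray(b"clean page contents")
disk_a, disk_b = bytearray(len(page)), bytearray(len(page))
mirror_write(page, [disk_a, disk_b], between_writes=scribble)
unpinned_match = digest(disk_a) == digest(disk_b)

# Case 2: same modification, but every mirror is written from one copy.
page = bytearray(b"clean page contents")
mirror_write(page, [disk_a, disk_b], between_writes=scribble, snapshot=True)
pinned_match = digest(disk_a) == digest(disk_b)

print("live buffer, mirrors match:", unpinned_match)
print("snapshot,    mirrors match:", pinned_match)
```

In the first case the two mirrors end up with different contents, which is the kind of difference a check pass would count as a mismatch. In the second case the live buffer has still changed, but both mirrors hold the same (older) data.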

I hoped that mounting with data=journal would lead to consistency, but that
seems not to be true either.
I agree Bill, there is an issue with the software RAID1 when it comes down
to some hardware. I have one machine where the ONLY way to stop the root
filesystem going readonly due to journal issues is to remove RAID. Having
RAID1 enabled gives silent corruption of both data and the journal at
seemingly random times.

I can see the data corruption from running a verify between RPM and data
on the drive. Reinstalling these packages fixes things - until some other
random things get corrupted next time.

Sounds very much like dodgy drives.

The myth that data corruption in RAID1 ONLY happens to swap and/or unused
space on a drive is absolute rubbish.


Absolute rubbish does seem to be a suitable phrase here.
There is no question of data corruption.
When memory changes between being written to one device and to another, this
does not cause corruption, only inconsistency.   Either the block will be
written again consistently soon, or it will never be read.

Just what is it that rewrites the data block? The user program doesn't know it's needed, the filesystem, if any, doesn't know it's needed, and as far as I can tell md doesn't do a checksum before issuing the write and again after the last write is done. Nor does it make a copy and write from that. So what sees that the data has changed and rewrites it?
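
For illustration only, the checksum-before-and-after idea raised in the question could look roughly like the Python sketch below. As the question itself says, md is not believed to do this, so this is purely a hypothetical user-space model: `write_all_checked`, the `between_writes` hook and `max_passes` are invented names, and CRC32 is just a convenient stand-in checksum.

```python
import zlib


def write_all_checked(buf, mirrors, between_writes=None, max_passes=3):
    """Write buf to every mirror, repeating if the buffer changed meanwhile.

    Returns the number of passes needed. The checksum is taken before the
    first write and compared after the last one, as the question describes.
    """
    for attempt in range(1, max_passes + 1):
        before = zlib.crc32(buf)
        for index, mirror in enumerate(mirrors):
            mirror[:] = buf
            # Hypothetical hook: lets a test modify the buffer mid-write.
            if between_writes is not None and index == 0:
                between_writes(buf, attempt)
        if zlib.crc32(buf) == before:
            return attempt  # buffer was stable, so all mirrors agree
    raise RuntimeError("buffer kept changing; mirrors may differ")


def scribble_once(buf, attempt):
    """Modify the buffer during the first pass only."""
    if attempt == 1:
        buf[0:7] = b"CHANGED"


page = bytearray(b"journal block payload")
disk_a, disk_b = bytearray(len(page)), bytearray(len(page))
passes = write_all_checked(page, [disk_a, disk_b], between_writes=scribble_once)
print("passes:", passes, "mirrors equal:", disk_a == disk_b)
```

The cost is an extra checksum pair per write and an occasional repeated write; the alternative design, copying the buffer first, trades that for one memory copy per write.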

If the host crashes before the blocks are made consistent, then the inconsistency will not be visible as the resync will fix it.

If you are getting any corruption, then it is NOT due to this facet of the
RAID1 implementation - it is due to something else.
My guess is bad hardware - anywhere from memory to hard drive.

Having switched an array from three-way raid-1 to raid-6, using the same kernel, utilities, and hardware, I can speak to that. When I first started to run checks, I took the array offline to do repair, and usually saw ~12k mismatches by the end of a week. After changing the array to raid-6 I never had a mismatch again. Therefore, while hardware clearly can be a factor, it is unlikely to be the cause of all mismatch events.
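
For anyone wanting to track the same counter, here is a small Python sketch that reads it from the md sysfs interface. The `sync_action` and `mismatch_cnt` files under `/sys/block/<array>/md/` are the standard md sysfs entries; the array name `md0`, the helper names, and the `sysfs_root` parameter (there so the helper can be pointed at a test directory) are assumptions made for this example.

```python
from pathlib import Path


def md_attr_path(array, attr, sysfs_root="/sys/block"):
    """Path of an md sysfs attribute, e.g. /sys/block/md0/md/mismatch_cnt."""
    return Path(sysfs_root) / array / "md" / attr


def read_mismatch_cnt(array, sysfs_root="/sys/block"):
    """Return the mismatch count reported after the last check or repair."""
    text = md_attr_path(array, "mismatch_cnt", sysfs_root).read_text()
    return int(text.strip())


def start_check(array, sysfs_root="/sys/block"):
    """Request a read-only consistency check (needs root on a real system)."""
    md_attr_path(array, "sync_action", sysfs_root).write_text("check\n")


if __name__ == "__main__":
    # "md0" is an assumed array name; substitute your own.
    print("mismatch_cnt:", read_mismatch_cnt("md0"))
```

Logging this value after each scheduled check gives a history that can be compared before and after a change such as the raid-1 to raid-6 conversion described above.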

--
Bill Davidsen <davidsen@xxxxxxx>
 "We can't solve today's problems by using the same thinking we
  used in creating them." - Einstein
