On Tue, Feb 04, 2003 at 04:43:17PM -0000, Bodrogi Viktor wrote:
> > > Do You know about if there is a mode switch for RAID-1 setup (my case is
> > > evms-raid) to do this comparison?
> > > This makes sense as an option for debugging and for high availability
> > > production also.
> >
> > No there isn't, on any RAID systems that I'm aware of.
>
> This really breaks my confidence in RAID-1 mirrors.
> Would the situation get better with a four disk RAID-5?
> As I imagine, it should...

Nope.  RAID-5 has a "parity stripe", yes, but it's not used to protect
against errors.  It's used to rebuild the RAID array after a disk
failure.

Reading the same block from both mirrors and comparing the two copies
would require extra memory (you need a place to store the extra disk
block), consume memory bandwidth and CPU time to do the block compare,
and increase overall latency (since you have to wait for both disk
blocks to be received and compared before the user application can
touch the page).  I don't know of any RAID system that has been willing
to design in the extra complexity, even as a "debugging" option.

Keep in mind that the RAID design comes from high-end systems where
performance is emphasized, and the only thing that required protection
was the outright failure of the disk drive itself.  Things like CRC or
other checksums were presumed to protect against data errors.

In your particular case, where you told us that you were seeing data
from other files appearing in the wrong place, my guess is that it's the
actual block address which is getting corrupted, not the data being
transferred.  If I recall correctly, IDE UDMA protects the data blocks
being transferred using a CRC, but I don't believe the IDE command block
itself is protected, and that's probably how you're getting screwed; if
the command block gets corrupted, then the disk drive will send the
wrong disk block back in response to a read request.

> If this phenomena is HW error, should it be logged anywhere?
> I didn't find anything in syslog...

Well, if it is a corrupted block/sector number, it won't get logged
because the HW isn't noticing that something has gone wrong.

It would be odd, though, for a cable fault to corrupt just the request
address and nothing else.  Some kind of weird fault in the controller
or in the disk drives themselves might explain these results.

						- Ted

_______________________________________________
Ext3-users@redhat.com
https://listman.redhat.com/mailman/listinfo/ext3-users
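
To make the RAID-5 parity point above concrete, here is a minimal
sketch, assuming a four-disk stripe with three data blocks and one XOR
parity block.  The function names are made up for illustration, and
this is not how any particular RAID driver is written.  It shows that
parity can rebuild a block from a disk known to have failed, while a
block that was silently swapped for the wrong one only shows up if
somebody explicitly checks the whole stripe, and even then the check
can't say which block is the bad one.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe, failed_index):
    """Recompute the block at failed_index from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


def stripe_is_consistent(stripe):
    """True if the XOR of all blocks (data plus parity) is all zeros."""
    return not any(xor_blocks(stripe))


# Three data blocks plus one parity block (hypothetical contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]

# Disk 1 fails outright: its block can be rebuilt from the other three.
rebuilt_block = rebuild(stripe, 1)

# Disk 1 silently returns the wrong block instead: nothing fails, and
# only an explicit whole-stripe check notices the inconsistency.
misdirected = list(stripe)
misdirected[1] = b"XXXX"
consistent_before = stripe_is_consistent(stripe)
consistent_after = stripe_is_consistent(misdirected)
```

Note that a normal read of one data block only touches one disk, so
the whole-stripe check in this sketch costs extra reads on every
access, which is the same trade-off as comparing both halves of a
mirror.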