Reliability of RAID 5 repair function (mismatch_cnt 9560824)

Hello there,

I have been running a RAID 5 array of four Seagate 4TB NAS drives (ST4000VN000) for 4 years now. The array is "scrubbed" every month using the "check" function, and there has never been a problem. The filesystem is ext4 with journaling.
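For reference, a minimal sketch of how such a monthly scrub can be triggered and its result read through the md sysfs interface. It assumes the array's sysfs directory is /sys/block/md0/md and that the script runs as root; the helper names `request_sync_action` and `read_mismatch_cnt` are made up for illustration.

```python
from pathlib import Path

# Actions the md driver accepts in sync_action that are relevant to scrubbing.
ALLOWED_ACTIONS = {"check", "repair", "idle"}


def request_sync_action(md_dir, action="check"):
    """Write a scrub action into <md_dir>/sync_action.

    md_dir is assumed to be something like /sys/block/md0/md.
    "check" only counts mismatches; "repair" also rewrites them.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    Path(md_dir, "sync_action").write_text(action + "\n")


def read_mismatch_cnt(md_dir):
    """Return the mismatch counter left behind by the last check/repair."""
    return int(Path(md_dir, "mismatch_cnt").read_text().strip())
```

A call such as `request_sync_action("/sys/block/md0/md", "check")` starts the scrub; once it has finished, `read_mismatch_cnt("/sys/block/md0/md")` returns the counter.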

Last week I added another external backup drive, and after a reboot, disk 4 (sdd) was missing from the RAID. It was physically powered on and there was no error in the logs, but md0 was degraded. The SMART data are fine. I added the disk back manually, and since I use a bitmap, it was accepted immediately. I ran a "check" (scrub) afterwards, which went fine.

Anyway, after some heavy copying on the RAID, I moved about 1/3 of the data to the new backup drive, since I do not need it on the RAID. After another reboot, the mount failed and reported that the filesystem was not clean. I started an fsck, but it reported massive inode errors, so I stopped it and ran another "check" on the RAID. This gave me a mismatch_cnt of 9560824, which seems quite high.
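To put that number in perspective, here is a rough sketch of the arithmetic. It assumes that mismatch_cnt is reported in 512-byte sectors (as the kernel md documentation describes) and that the array is processed in 4 KiB pages; the function name `describe_mismatch` is only illustrative.

```python
SECTOR_BYTES = 512   # mismatch_cnt is assumed to be counted in 512-byte sectors
PAGE_BYTES = 4096    # assumption: 4 KiB pages on this machine


def describe_mismatch(mismatch_cnt):
    """Convert a mismatch_cnt value into bytes and whole pages."""
    total_bytes = mismatch_cnt * SECTOR_BYTES
    pages = total_bytes // PAGE_BYTES
    return total_bytes, pages


total_bytes, pages = describe_mismatch(9560824)
print(f"{total_bytes / 2**30:.2f} GiB flagged, in about {pages} pages")
```

Under these assumptions the count corresponds to roughly 4.6 GiB of flagged stripes. Because md compares whole pages, the real amount of differing data may be smaller than this.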

Right now I can mount the filesystem read-only, but two important directories, which I had not touched for almost 2 years, are gone. I cannot explain what went wrong.

I have read and understood this passage from
https://raid.wiki.kernel.org/index.php/Scrubbing_the_drives
"With a raid-5 array the only thing that can be done when there is an error is to correct the parity. This is also the most likely error - the scenario where the data has been flushed and the parity not updated is the expected cause of problems like this."
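The reason parity alone cannot point at the bad drive can be shown with a toy stripe. This is only a sketch of single-parity XOR with made-up 4-byte blocks, not how md lays out or processes data internally.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


# Toy stripe: three data blocks plus one parity block (4 "disks").
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]
zero = bytes(4)
assert xor_blocks(stripe) == zero  # a healthy stripe XORs to all zeroes

# Silently corrupt the block on "disk" index 1.
damaged = list(stripe)
damaged[1] = b"BxBB"
syndrome = xor_blocks(damaged)  # non-zero: a mismatch is detected

# Try "repairing" each disk in turn from the other three.
consistent_after_fix = []
for i in range(len(damaged)):
    others = damaged[:i] + damaged[i + 1:]
    candidate = list(damaged)
    candidate[i] = xor_blocks(others)
    consistent_after_fix.append(xor_blocks(candidate) == zero)

print("mismatch detected:", any(syndrome))
print("stripe consistent after rewriting disk i:", consistent_after_fix)
```

Rewriting any one of the four blocks makes the stripe consistent again, so the check can only say that a stripe is wrong, not which member is wrong. A repair that recomputes parity therefore restores consistency even if the faulty block was a data block.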

Is there any way to detect which drive has a problem? Of course I suspect drive 4. How reliable is the repair function of mdadm? I want to make sure the RAID integrity is OK before I try to recover data from the filesystem, which is probably quite a big next step. Otherwise I may consider trying a repair with only drives 1-3 assembled in the RAID.

Many many thanks for any hints in understanding the situation.
Michael



