Re: mdadm raid-check

Valeri Galtsev <galtsev@xxxxxxxxxxxxxxxxx> · Mon, 16 Nov 2020 09:28:04 -0600

> On Nov 16, 2020, at 2:48 AM, hw <hw@xxxxxxxx> wrote:
> 
> On Sat, 2020-11-14 at 21:55 -0600, Valeri Galtsev wrote:
>>> On Nov 14, 2020, at 8:20 PM, hw <hw@xxxxxxxx> wrote:
>>> 
>>> 
>>> Hi,
>>> 
>>> is it required to run /usr/sbin/raid-check once per week?  Centos 7 does
>>> this.  Maybe it's sufficient to run it monthly?  IIRC Debian did it monthly.
>> 
>> On hardware RAIDs I do RAID verification once a week. Once a month is
>> not often enough in my book. RAID verification effectively reads
>> all stripes of all drives (and verifies that the content of the
>> redundant drives is consistent), thus defusing a “time bomb”: a
>> drive left alone for too long that is ready to fail in an area
>> which is not accessed, and that fails when some other drive is
>> replaced and the RAID rebuild has to go over all stripes of all
>> drives. Such “multiple failures” are due to poor sysadmin work:
>> RAID verification not done often enough.
> 
> You mean there can be failures which can be detected during a
> raid-check and can still be repaired using the other disk, but they
> can be impossible to repair when a disk has failed?

No, what I meant to say is: the errors could have been detected, and the drive would have been kicked out of the RAID (not the errors repaired) and replaced with a good drive long ago. But if the RAID is not checked often, there is a chance that more drives than the redundancy can cover have failed (in different areas) and are waiting to be kicked out, and when that happens the failure becomes fatal.
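
For Linux software RAID (md), which the original question was about, here is a minimal sketch of how such a verification pass can be started and inspected through sysfs. As far as I know this is essentially what raid-check does, but treat that as my assumption. The function names and the sysfs_root parameter are made up for illustration (the parameter only exists so the code can be tried against a scratch directory), and writing to sync_action needs root.

```python
from pathlib import Path


def md_arrays(sysfs_root="/sys/block"):
    """Return names of md arrays that expose the md/sync_action control file."""
    root = Path(sysfs_root)
    return sorted(
        p.name for p in root.glob("md*") if (p / "md" / "sync_action").is_file()
    )


def scrub_status(name, sysfs_root="/sys/block"):
    """Read the current sync action and the mismatch counter of one array."""
    md = Path(sysfs_root) / name / "md"
    return {
        "sync_action": (md / "sync_action").read_text().strip(),
        "mismatch_cnt": int((md / "mismatch_cnt").read_text().strip()),
    }


def start_check(name, sysfs_root="/sys/block"):
    """Ask md to read and compare all stripes; do nothing if the array is busy."""
    action = Path(sysfs_root) / name / "md" / "sync_action"
    if action.read_text().strip() != "idle":
        return False  # a resync, rebuild or another check is already running
    action.write_text("check\n")
    return True
```

Calling start_check("md0") and later looking at scrub_status("md0") shows whether the pass has gone back to idle and whether any mismatches were counted.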

>> If software raid-check does the same, then it makes a lot of sense,
>> and I am more with RedHat's weekly cron job, than with Debian’s
>> Monthly.
> 
> How often do partial failures occur during normal operation?

I do not know what you mean by “partial failures”. I can imagine:

1. The checksum does not match, and there is no reason to suspect any particular drive of supplying the wrong information. If it is RAID-6, then on the assumption that only one drive provided wrong information, the wrong drive can be pinpointed and the stripe on it overwritten; the event is over without data being messed up. If it is RAID-5, there is no way to pinpoint the wrong drive. If your setting in the RAID firmware (I am speaking only about hardware RAIDs here) is to overwrite “parity”, there is a fair chance that the stripe on a drive that gave correct information is overwritten, and the content on the RAID device is damaged.

2. The checksum does not match and one of the drives responded with a significant delay. If there is no other way to pinpoint which drive the wrong information came from, the delayed drive is a fair suspect (it had to take time to read the “bad block” multiple times and maybe reallocate it). With fair certainty (but not 100%), the RAID will handle the situation without data corruption.

3. One of the drives timed out or reported an I/O error. The drive will be kicked out of the RAID, and it is up to the operator to decide whether to replace it or to attempt to rebuild the RAID onto the same drive.
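
To illustrate the RAID-5 versus RAID-6 difference in case 1, here is a small sketch with one byte per drive. It assumes XOR parity (P) for RAID-5 and, for RAID-6, an additional Q syndrome over GF(2^8) with polynomial 0x11d and generator 2, which is the commonly described scheme; a given controller's firmware may well do it differently. All function names are made up for illustration.

```python
from functools import reduce

# GF(2^8) log/antilog tables, polynomial x^8+x^4+x^3+x^2+1 (0x11d), generator 2.
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def parity_p(data):
    """RAID-5 style parity: XOR of all data bytes."""
    return reduce(lambda a, b: a ^ b, data, 0)


def parity_q(data):
    """Second RAID-6 syndrome: XOR of g^i * D_i over all data drives i."""
    q = 0
    for i, d in enumerate(data):
        q ^= gf_mul(EXP[i], d)
    return q


def raid5_candidates(data, p):
    """With P alone, every data drive is an equally plausible culprit.

    Returns one (index, value) "fix" per drive; each makes the parity match.
    """
    sp = parity_p(data) ^ p
    if sp == 0:
        return []
    return [(i, d ^ sp) for i, d in enumerate(data)]


def locate_bad_data_drive(data, p, q):
    """With P and Q, find the single wrong data byte and its correct value.

    Assumes at most one data drive is wrong and that P and Q are intact.
    Returns None if the stripe is consistent.
    """
    sp = parity_p(data) ^ p
    sq = parity_q(data) ^ q
    if sp == 0 and sq == 0:
        return None
    if sp == 0 or sq == 0:
        raise ValueError("P or Q itself is inconsistent, not a data drive")
    z = (LOG[sq] - LOG[sp]) % 255  # sq = g^z * sp, so z is the drive index
    if z >= len(data):
        raise ValueError("more than one drive appears to be wrong")
    return z, data[z] ^ sp
```

If one byte in a four-drive stripe is flipped, raid5_candidates() returns four different repairs that all satisfy the parity, while locate_bad_data_drive() names the one drive that was actually changed, provided the single-error assumption holds.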

>  In case
> there was a power failure, it's probably a good idea to do a check
> anyway.

If you care about the data on your RAID, you will use a battery backup unit, which keeps the content of the volatile RAM cache from being lost, so that when power returns, the cache can be flushed to the drives. (Without cache, hardware RAID devices are noticeably slower than with cache enabled.) [Non-volatile caches and supercapacitors are used as well.]

However, the drives themselves have volatile memory as cache, which will evaporate when power suddenly disappears. To make things worse, drives are designed to lie about “transaction complete” (so manufacturers can declare better specs than their competitors): “transaction complete” is reported when the data is still in the drive’s volatile cache, not on the platters. As far as I know, there is no way to query a drive and get an honest answer as to whether data is already on the platters or not. Therefore, a hardware RAID card may think some transactions are completed, but they may never be completed in case of power loss.

So, when power suddenly goes… it is potentially a mess on an I/O-intensive box. Even with a RAID battery backup for the cache (or with the RAID cache disabled), it is a good idea to have the machine behind a UPS and to start a clean shutdown when the UPS battery has less than [3 minutes in my case, yours may be different] of juice left.

I hope this helps.

Valeri

>> Valeri
>> 
>>> I just checked on Fedora 32.  It does not run raid-check at all, at least not
>>> via a cron entry.  /usr/sbin/raid-check is available, though.  Is that an
>>> oversight?  (I started it manually now and will check if it's run once I update
>>> to 33.)
> 
> _______________________________________________
> CentOS mailing list
> CentOS@xxxxxxxxxx
> https://lists.centos.org/mailman/listinfo/centos
