On Tue, Jul 9, 2013 at 3:46 PM, NeilBrown <neilb@xxxxxxx> wrote:
> On Tue, 9 Jul 2013 15:33:29 -0600 Curtis <serverascode@xxxxxxxxx> wrote:
>
>> Hi All,
>>
>> I'm wondering: what is the best way to determine when a RAID0 has failed?
>>
>> We have some stateless servers that use a stripe/RAID0, but we'll need
>> to know if it has failed so we can pull it out of the "cluster" and
>> rebuild it. It would be better to find out sooner rather than later
>> that the stripe has failed.
>>
>> I know from reading the man page that I can't use mdadm to monitor the
>> stripe. Is it basically just that the device becomes unusable in some
>> fashion?
>>
>
> How would you determine if a lone drive had failed?
> Presumably by error messages in the kernel logs, or similar.
> Use exactly the same mechanism to test if a RAID0 has failed.

Ok, that makes total sense, thanks. :)

>
> (A "RAID0" doesn't fail as a whole. Bits of it might fail while other
> bits keep working, just like a drive which can lose some sectors while
> other sectors keep working. Certainly a whole drive can fail if its
> logic board dies. Similarly a whole RAID0 can fail if the SATA/SCSI/USB
> controller dies.)

Noted. Thanks again,

Curtis.

>
> NeilBrown

--
Twitter: @serverascode
Blog: serverascode.com
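[Editor's note: the sketch below is illustrative and not from the original thread. It follows Neil's suggestion literally: scan the kernel log for I/O errors touching the array's member devices. It assumes a Linux host with journalctl available, a hypothetical array named md0 whose members appear under /sys/block/md0/slaves, and sufficient privileges to read the kernel journal.]

#!/usr/bin/env python3
# Illustrative sketch (not from the original thread): flag a likely RAID0
# member failure by scanning kernel messages for I/O errors, since mdadm's
# --monitor mode does not cover RAID0. Assumes Linux, journalctl, and a
# hypothetical md array named "md0"; adjust ARRAY for your system.

import os
import subprocess

ARRAY = "md0"  # hypothetical array name

def member_devices(array):
    """Member block devices of the array, e.g. ['sda1', 'sdb1']."""
    return os.listdir(f"/sys/block/{array}/slaves")

def kernel_log_lines():
    """Kernel messages from the current boot (may require root/journal access)."""
    out = subprocess.run(
        ["journalctl", "-k", "-b", "--no-pager", "-o", "cat"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

def failed_members(array):
    """Members mentioned alongside 'I/O error' in the kernel log."""
    members = member_devices(array)
    bad = set()
    for line in kernel_log_lines():
        if "I/O error" not in line:
            continue
        for dev in members:
            # Match the partition name (sda1) or its base disk (sda), since
            # error lines may reference either.
            if dev in line or dev.rstrip("0123456789") in line:
                bad.add(dev)
    return sorted(bad)

if __name__ == "__main__":
    bad = failed_members(ARRAY)
    if bad:
        print(f"RAID0 {ARRAY}: I/O errors seen on {', '.join(bad)}; "
              "pull this node from the cluster and rebuild.")
        raise SystemExit(1)
    print(f"RAID0 {ARRAY}: no member I/O errors in the kernel log.")

[In practice something like this would run from cron or a monitoring agent on each stateless node; it is only a heuristic, since an "I/O error" line does not always mean a dead drive, and per Neil's caveat a RAID0 can also die wholesale with the controller.]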