Re: Linux Software RAID a bit of a weakness?

This is the most useful thing I have found in a long time!

p34:~# echo check > /sys/block/md0/md/sync_action
$ cat /sys/block/md[0-4]/md/mismatch_cnt
512
0
0
0
0

Wow!

Justin.

On Fri, 23 Feb 2007, Steve Cousins wrote:

Colin Simpson wrote:
Hi,

We had a small server here that was configured with a RAID 1 mirror,
using two IDE disks. Last week one of the drives in it failed, so we
replaced the drive and set the array to rebuild. The "good" disk then
hit a bad block and the mirror failed.

Now I presume that the "good" disk must have had an underlying bad block
in either unallocated space or a file I never access. As RAID works
at the block level, you only ever see this on an array rebuild, when it's
often catastrophic. Is this a bit of a flaw?

I know there is a definite probability of two drives failing within a
short period of time. But this is a bit different, as it's the
probability of two drives failing over a much larger time scale if
one of the flaws is hidden in unallocated space (maybe a dirt particle
finds its way onto the surface or something). This would make RAID buy
you a lot less in reliability, I'd have thought.

I seem to remember seeing in the log file for a Dell PERC something
about scavenging for bad blocks. Do hardware RAID systems have a
mechanism that, at times of low activity, searches the disks for bad
blocks to help guard against this sort of failure (so a disk error is
reported early)?

On Software RAID, apart from a three-way mirror (which I don't think is
supported at present), is there any merit in, say, cat'ing the whole
disk devices to /dev/null every so often to check that the whole surface
is readable? (I presume just reading the raw device won't upset things;
don't worry, I don't plan on trying it on a production system.)

Any thoughts? I presume people have thought of this before and I must
be missing something.

Yes, this is an important thing to keep on top of, both for hardware RAID and software RAID. For md:

	echo check > /sys/block/md0/md/sync_action

This should be done regularly. I have cron do it once a week.
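
For anyone who wants to script it, here is a minimal sketch of what such a job might look like. It is not the exact job I run: the function names and the SYSFS_BLOCK override are made up for illustration (the override only exists so the functions can be tried against a scratch directory), and the only real interfaces it touches are the sync_action and mismatch_cnt files shown in this thread.

```shell
#!/bin/sh
# Sketch: request an md consistency check and report mismatch counts.
# SYSFS_BLOCK is a hypothetical override (defaults to /sys/block).
: "${SYSFS_BLOCK:=/sys/block}"

# Start a "check" pass on one array, e.g.: start_check md0
# Needs root on a real system, since it writes to sysfs.
start_check() {
    action="$SYSFS_BLOCK/$1/md/sync_action"
    [ -w "$action" ] || { echo "cannot write $action" >&2; return 1; }
    echo check > "$action"
}

# Print "<array> <mismatch_cnt>" for every md array that exposes a count.
report_mismatches() {
    for f in "$SYSFS_BLOCK"/md*/md/mismatch_cnt; do
        [ -r "$f" ] || continue
        dev=${f#"$SYSFS_BLOCK"/}   # strip the leading directory
        dev=${dev%%/*}             # keep only the array name, e.g. md0
        echo "$dev $(cat "$f")"
    done
}
```

Calling start_check md0 from a weekly cron entry (the schedule is up to you), and report_mismatches some time later once the check has finished, would cover both halves. A non-zero count is a reason to look closer, not proof of a failing disk.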

Check out: http://neil.brown.name/blog/20050727141521-002

Good luck,

Steve
--
______________________________________________________________________
Steve Cousins, Ocean Modeling Group    Email: cousins@xxxxxxxxxxxxxx
Marine Sciences, 452 Aubert Hall       http://rocky.umeoce.maine.edu
Univ. of Maine, Orono, ME 04469        Phone: (207) 581-4302


-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
