Re: Logging-Loop when a drive in a raid1 fails.

Paul Clements wrote:
> Michael Renner wrote:
>
>> one of the drives in a software raid1 failed, on a machine running 2.6.9-rc2, leading to this "logging-spree" (see attachment).
>>
>> Sorry if this has been fixed in the meanwhile; it's not that easy to
>
> It has. I sent the patch to Neil Brown a while back to fix this problem. I believe it made 2.6.9.

Ok, good to hear.

>> test codepaths for failing drives with various kernels without having access to special block devices which support on-demand-failing.
>
> mdadm -f /dev/md0 <drive>
>
> roughly approximates a drive failure

IIRC this doesn't touch any of the codepaths involved in handling unreadable blocks on a block device, rescheduling reads to another drive, etc., so it isn't a real alternative to funky block devices ;).
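For what it's worth, the device-mapper "error" target (available in 2.6 kernels built with device-mapper support) can stand in for such a funky block device: map most of a disk linearly, but route a chosen sector range to the error target so reads there fail on demand. The sketch below only prints the dm table it would use rather than loading it (dmsetup needs root); the device name and sector numbers are made-up examples, not anything from this thread.

```shell
#!/bin/sh
# Sketch: build a device-mapper table in which a small sector range of an
# underlying disk returns I/O errors, so md's read-error handling can be
# exercised on demand. All names/sizes below are hypothetical examples.
DISK=/dev/sdb1        # hypothetical backing device
SIZE=1000000          # total size in 512-byte sectors (see: blockdev --getsz)
BAD_START=5000        # first sector that should fail
BAD_LEN=8             # number of failing sectors

# dm table format: <start> <length> <target> <args...>
TABLE=$(cat <<EOF
0 $BAD_START linear $DISK 0
$BAD_START $BAD_LEN error
$((BAD_START + BAD_LEN)) $((SIZE - BAD_START - BAD_LEN)) linear $DISK $((BAD_START + BAD_LEN))
EOF
)

# Print the table; as root you could pipe it into
#   dmsetup create faulty
# to get a /dev/mapper/faulty device with a bad patch in the middle.
printf '%s\n' "$TABLE"
```

Adding such a mapped device to an array and reading across the bad range would force md down the unreadable-block / read-rescheduling path that `mdadm -f` skips.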


>> Furthermore I'm a bit concerned about the overall quality of the md support in 2.6
>
> I don't think you should be. md in 2.6 (as of 2.6.9 or so) is as stable as 2.4, at least according to our stress tests.

Including semi-dead/dying drives? As I said, normal operation is rock solid; it's just the edge-case, rarely exercised stuff which tends (or tended) to break.


best regards,
michael
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
