On Tue, 7 Jun 2011 00:01:04 -0700 "fibreraid@xxxxxxxxx" <fibreraid@xxxxxxxxx> wrote:

> Hello,
>
> I did test IO, and upon issuing IO, md correctly detected the
> failure and began a rebuild. However, in my opinion this is
> inadequate, and I do not believe it is correct behavior. As I
> recall from prior experience with md, it would initiate a rebuild
> on drive removal alone, even without any pending IO.
>
> I would appreciate some further feedback on this behavior. Thanks!

MD has never been able to respond to a drive removal - only to an IO
error.

If you want md to notice when a drive is removed, then you need a udev
rule to tell it.  The rule can run

   mdadm --incremental --fail devicename

where 'devicename' is not "/dev/sda", as that won't exist any more, but
"sda", which is the kernel-internal name for the device.

NeilBrown


> -Tommy
>
>
> On Mon, Jun 6, 2011 at 2:25 PM, CoolCold <coolthecold@xxxxxxxxx> wrote:
> > On Mon, Jun 6, 2011 at 10:20 PM, fibreraid@xxxxxxxxx
> > <fibreraid@xxxxxxxxx> wrote:
> >> Hello,
> >>
> >> I am running the 64-bit Linux 2.6.38 kernel with mdadm 3.2.1. The
> >> server hardware has dual-socket Westmere CPUs (4 cores each), 24 GB of
> >> RAM, and 24 hard drives connected via SAS.
> >>
> >> I create an md0 array with 23 active drives, 1 hot-spare, RAID 5, and
> >> a 64K chunk. After synchronization is complete, I have:
> >>
> >> root::~# cat /proc/mdstat
> >> Personalities : [raid6] [raid5] [raid4]
> >> md0 : active raid5 sdf1[23](S) sdi1[22] sdh1[21] sdg1[20] sde1[19]
> >> sdd1[18] sdc1[17] sdo1[16] sdn1[15] sdq1[14] sdp1[13] sdr1[12]
> >> sdm1[11] sdl1[10] sdk1[9] sdj1[8] sdv1[7] sdu1[6] sdt1[5] sds1[4]
> >> sdy1[3] sdx1[2] sdb1[1] sdw1[0]
> >>       2149005056 blocks super 1.2 level 5, 64k chunk, algorithm 2
> >>       [23/23] [UUUUUUUUUUUUUUUUUUUUUUU]
> >>
> >> Then I remove an active drive from the system by unplugging it. udev
> >> catches the event, and fdisk -l reports one less drive. In this case,
> >> I remove /dev/sdv.
> >>
> >> However, /proc/mdstat remains unchanged. It's as if md has no idea
> >> that the drive disappeared. I would expect md at this point to have
> >> detected the removal and to have automatically kicked off a resync
> >> using the included hot-spare. But this does not occur.
> >>
> >> If I then run mdadm -R /dev/md0, in an attempt to "wake up" md, then
> >> md does realize the change and does start resyncing.
> > I guess md only realizes the drive is gone when a read/write error
> > occurs, which will happen pretty soon if the array is in use. Can you
> > start some dd reads and then remove the drive?
> >
> >>
> >> I do not believe this is normal behavior. Can you advise?
> >>
> >> Thank you!
> >> -Tommy
> >
> >
> > --
> > Best regards,
> > [COOLCOLD-RIPN]
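
For anyone finding this in the archives: below is a minimal, untested sketch of
the kind of udev rule Neil describes. The rules-file name is only an example,
and it assumes mdadm lives in /sbin; $name is udev's substitution for the
device-node name (e.g. "sdv1"), i.e. the kernel-style name that
mdadm --incremental --fail (short form -If) expects once /dev/sdv is gone.

   # /etc/udev/rules.d/65-md-fail-removed.rules  (example name, untested sketch)
   # On any block-device "remove" event, ask mdadm to fail that member out of
   # whatever array it belongs to, so the hot-spare rebuild starts immediately.
   # For devices that were never md members, mdadm just prints an error and
   # does nothing, which is harmless here.
   SUBSYSTEM=="block", ACTION=="remove", RUN+="/sbin/mdadm --incremental --fail $name"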
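
And to confirm the IO-error path CoolCold suggests, any ongoing read against
the array will do while the drive is pulled, e.g. something like:

   # Keep read IO flowing on the array; the first read that lands on the
   # missing disk returns an IO error, at which point md fails the member
   # and starts rebuilding onto the hot-spare.
   dd if=/dev/md0 of=/dev/null bs=1M &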