On Mon, 14 May 2012 12:53:00 +0200 Michał Sawicz <michal@xxxxxxxxxx> wrote:

> On 2012-05-14, Mon at 20:22 +1000, NeilBrown wrote:
> > On Sun, 13 May 2012 20:21:48 +0200 Michał Sawicz <michal@xxxxxxxxxx> wrote:
> >
> > > Hey,
> > >
> > > I've a weird issue with a RAID6 setup, /proc/mdstat says:
> > >
> > > > md126 : active raid6 sda1[3] sdh1[6] sdg1[0](F) sdf1[5] sdi1[1] sdc[8] sdb[7]
> > > >       9767559680 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [_UUUUUU]
> > >
> > > So sdg1 is (F)ailed, yet `mdadm --remove` yields:
> > >
> > > > md: cannot remove active disk sdg1 from md126 ...
> >
> > There is a period of time between when a device fails and when the raid456
> > module finally lets go of it so it can be removed.  You seem to be in this
> > period of time.
> > Normally it is very short.  It needs to wait for any requests that have
> > already been sent to the device to complete (probably with failure) and
> > very shortly after that it should be released.  So this is normally much less
> > than one second but could be several seconds if some excessive retry is
> > happening.
> >
> > But I'm guessing you have waited more than a few seconds.
>
> Yup :)
>
> > I vaguely recall a bug in the not too distant past whereby RAID456 wouldn't
> > let go of a device quite as soon as it should.  Unfortunately I don't
> > remember the details.  You might be able to trigger it to release the drive
> > by adding a spare - if you have one - or maybe by just
> >    echo sync > /sys/block/md126/md/sync_action
> > it won't actually do a sync, but it might check things enough to make
> > progress.
>
> # echo sync > /sys/block/md126/md/sync_action
> -bash: echo: write error: Device or resource busy

Hmmm....  Looks like MD_RECOVERY_NEEDED is already set.  But
remove_and_add_spares() isn't removing the failed device from the array.

I cannot find anything since 2.6.38 that looks like your symptoms.

Is the array still functioning?
Are there any interesting messages appearing in the kernel logs?
What does
   grep . /sys/block/md126/md/dev*/*
show?

NeilBrown

> > eh?
> >
> > What kernel are you using?
>
> # uname -a
> Linux media 2.6.38-gentoo-r6 #2 SMP Tue Sep 13 19:13:42 CEST 2011 x86_64
> AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux
>
> Thanks,
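
If the full grep output is too noisy, the per-device state can also be checked on its own. The following is only a rough sketch: it assumes the usual md sysfs layout, where each member has a dev-<name>/state file holding a comma-separated list of flags such as "faulty" or "in_sync". The function name list_faulty and its optional directory argument are made up for illustration and are not part of mdadm or the kernel.

```shell
#!/bin/sh
# list_faulty [MD_DIR]
# Print the name of every member device whose state file lists "faulty".
# MD_DIR defaults to /sys/block/md126/md (the array from this thread);
# passing another directory is an illustrative convenience, not an md feature.
list_faulty() {
    md_dir=${1:-/sys/block/md126/md}
    for state_file in "$md_dir"/dev-*/state; do
        # Skip the literal, unexpanded pattern when no dev-* entries exist.
        [ -f "$state_file" ] || continue
        # The file holds flags separated by commas, e.g. "faulty,in_sync".
        state=$(cat "$state_file")
        case ",$state," in
            *,faulty,*)
                dev_dir=${state_file%/state}
                # Strip everything up to and including "/dev-".
                echo "${dev_dir##*/dev-}"
                ;;
        esac
    done
}
```

Calling list_faulty with no argument only reads from sysfs and changes nothing. A device that shows up here but still cannot be removed would be consistent with the array not having let go of it yet.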