On Mon, 2012-05-14 at 20:22 +1000, NeilBrown wrote:
> On Sun, 13 May 2012 20:21:48 +0200 Michał Sawicz <michal@xxxxxxxxxx> wrote:
>
> > Hey,
> >
> > I've a weird issue with a RAID6 setup, /proc/mdstat says:
> >
> > > md126 : active raid6 sda1[3] sdh1[6] sdg1[0](F) sdf1[5] sdi1[1] sdc[8] sdb[7]
> > >       9767559680 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [_UUUUUU]
> >
> > So sdg1 is (F)ailed, yet `mdadm --remove` yields:
> >
> > > md: cannot remove active disk sdg1 from md126 ...
>
> There is a period of time between when a device fails and when the raid456
> module finally lets go of it so it can be removed.  You seem to be in this
> period of time.
> Normally it is very short.  It needs to wait for any requests that have
> already been sent to the device to complete (probably with failure) and
> very shortly after that it should be released.  So this is normally much less
> than one second but could be several seconds if some excessive retry is
> happening.
>
> But I'm guessing you have waited more than a few seconds.

Yup :)

> I vaguely recall a bug in the not too distant past whereby RAID456 wouldn't
> let go of a device quite as soon as it should.  Unfortunately I don't
> remember the details.  You might be able to trigger it to release the drive
> by adding a spare - if you have one - or maybe by just
>    echo sync > /sys/block/md126/md/sync_action
> it won't actually do a sync, but it might check things enough to make
> progress.

# echo sync > /sys/block/md126/md/sync_action
-bash: echo: write error: Device or resource busy

eh?

> What kernel are you using?

# uname -a
Linux media 2.6.38-gentoo-r6 #2 SMP Tue Sep 13 19:13:42 CEST 2011 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux

Thanks,
-- 
Michał Sawicz <michal@xxxxxxxxxx>