On Wed, 23 Dec 2009 19:51:33 +0800 "spren.gm@xxxxxxxxx" <spren.gm@xxxxxxxxx> wrote:

> Hi,
> Is it intended that synchronization is interrupted when a spare disk
> becomes faulty (detached from the raid, or really faulty)? We hit this
> case several days ago with kernel version 2.6.24: after we unplugged a
> spare disk of a raid5 which had a bitmap and was recovering, the spare
> disk status became faulty and synchronization restarted from 0%.
>
> Looking into the md code, I find that md_error() in md/md.c doesn't
> distinguish between spare disks and normal disks. Should we make a
> faulty spare disk not interrupt raid synchronization?
> Disks nowadays have become much larger, and recovering one disk may
> take several hours or even longer.
>

Yes, it is intended that any synchronisation is interrupted when any
device fails.
However, if the device was just an inactive spare, then the
synchronisation should start again from the same place that it was up to,
or at least it should repeat the already-done part very quickly.

Can you test on a more recent kernel?
Can you give precise details of the steps, the kernel log messages, and
the mdstat output?

NeilBrown
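
One way to collect the mdstat details asked for above is to record the progress line before and after the spare is pulled, so it is clear whether the resync really went back to 0%. The following is only a minimal sketch and is not part of the original exchange: the function name parse_sync_progress and the SAMPLE text are hypothetical, and the regular expression assumes the usual "recovery = NN.N% (done/total)" progress line that /proc/mdstat prints while an array is syncing.

```python
import re

# Matches the progress part of an mdstat line, e.g.
#   "recovery = 12.6% (37043392/292945152)"
PROGRESS_RE = re.compile(
    r"(?P<action>resync|recovery|reshape|check)\s*=\s*"
    r"(?P<percent>\d+(?:\.\d+)?)%\s*\((?P<done>\d+)/(?P<total>\d+)\)"
)
# Matches the first line of an array stanza, e.g. "md0 : active raid5 ..."
ARRAY_RE = re.compile(r"^(?P<name>md\S+)\s*:")


def parse_sync_progress(mdstat_text):
    """Return {array_name: progress dict} for arrays that are syncing."""
    progress = {}
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group("name")
            continue
        found = PROGRESS_RE.search(line)
        if found and current is not None:
            progress[current] = {
                "action": found.group("action"),
                "percent": float(found.group("percent")),
                "done": int(found.group("done")),
                "total": int(found.group("total")),
            }
    return progress


# Illustrative text only (not output from the reported system).
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[4] sdc1[2] sdb1[1] sda1[0]
      878835456 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [==>..................]  recovery = 12.6% (37043392/292945152) finish=127.5min speed=33440K/sec
      bitmap: 0/140 pages [0KB], 1024KB chunk

unused devices: <none>
"""
```

On a live system the same function could be fed the contents of /proc/mdstat (for example open("/proc/mdstat").read()) just before and just after the spare is removed. Comparing the "done" counts from the two readings would show whether recovery resumed from where it was or started over.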