On Sun, 5 Jun 2011 22:41:55 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx> wrote:

> Hello everybody,
> I am testing a scenario in which I create a RAID5 with three devices:
> /dev/sd{a,b,c}. Since I don't supply --force to mdadm during creation,
> it treats the array as degraded and starts rebuilding sdc as a
> spare. This is as documented.
>
> Then I do --fail on /dev/sda. I understand that at this point my data
> is gone, but I think I should still be able to tear down the array.
>
> Sometimes I see that /dev/sda is kicked from the array as faulty, and
> /dev/sdc is also removed and marked as a spare. Then I am able to tear
> down the array.
>
> But sometimes, it looks like the system hits some kind of a deadlock.

I cannot reproduce this, either on current mainline or 2.6.38.  I didn't
try the particular Ubuntu kernel that you mentioned, as I don't have any
Ubuntu machines.  It is unlikely that Ubuntu have broken something, but
not impossible... are you able to compile a kernel.org kernel (preferably
2.6.39) and see if you can reproduce it there?

Also, can you provide a simple script that triggers the bug reliably for
you?  I did:

  while : ; do
    mdadm -CR /dev/md0 -l5 -n3 /dev/sd[abc]
    sleep 5
    mdadm /dev/md0 -f /dev/sda
    mdadm -Ss
    echo ; echo
  done

and it has no problems at all.  Certainly a deadlock shouldn't be
happening...

From the stack trace you sent, it looks like it is probably hanging at

  wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));

which suggests that a resync request started and didn't complete.  I've
never seen a hang there before.

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html