On Wednesday December 19, jnelson-linux-raid@xxxxxxxxxxx wrote:
> On 12/19/07, Jon Nelson <jnelson-linux-raid@xxxxxxxxxxx> wrote:
> > On 12/19/07, Neil Brown <neilb@xxxxxxx> wrote:
> > > On Tuesday December 18, jnelson-linux-raid@xxxxxxxxxxx wrote:
> > > >
> > > > I tried to stop the array:
> > > >
> > > > mdadm --stop /dev/md2
> > > >
> > > > and mdadm never came back. It's off in the kernel somewhere. :-(

Looking at your stack traces, you have the "mdadm -S" holding
an md lock and trying to get a sysfs lock as part of tearing down the
array, and 'hald' is trying to read some attribute in /sys/block/md....
and is holding the sysfs lock and trying to get the md lock.
A classic AB-BA deadlock.

>
> NOTE: kernel is stock openSUSE 10.3 kernel, x86_64, 2.6.22.13-0.3-default.
>

It is fixed in mainline with some substantial changes to sysfs.
I don't imagine they are likely to get back ported to openSUSE, but
you could try logging a bugzilla if you like.

The 'hald' process is interruptible and killing it would release the
deadlock.

I suspect you have to be fairly unlucky to lose the race but it is
obviously quite possible.

I don't think there is anything I can do on the md side to avoid the
bug.

NeilBrown
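
For anyone reading this in the archive who has not met the term, here is a minimal userspace sketch of the AB-BA pattern described above. It is not kernel code and not how md or sysfs are implemented: `md_lock`, `sysfs_lock`, `run_ab_ba` and the "stopper"/"reader" names are hypothetical stand-ins for the two locks and for the "mdadm -S" and 'hald' roles. A barrier forces the unlucky interleaving, and a timeout stands in for "never came back" so the script terminates.

```python
import threading


def run_ab_ba(timeout=0.2):
    """Return (stopper_ok, reader_ok): did each side get its second lock?"""
    md_lock = threading.Lock()      # hypothetical stand-in for the md lock
    sysfs_lock = threading.Lock()   # hypothetical stand-in for the sysfs lock
    both_hold_first = threading.Barrier(2)
    both_tried = threading.Barrier(2)
    result = {}

    def worker(name, first, second):
        with first:
            # Wait until the other side also holds its first lock, so the
            # race is always lost (in real life this needs bad luck).
            both_hold_first.wait()
            # A real deadlock would block here forever; the timeout only
            # exists so this demonstration can finish.
            got = second.acquire(timeout=timeout)
            if got:
                second.release()
            result[name] = got
            # Keep holding the first lock until both sides have given up.
            both_tried.wait()

    # "stopper" takes md then sysfs; "reader" takes sysfs then md.
    stopper = threading.Thread(target=worker,
                               args=("stopper", md_lock, sysfs_lock))
    reader = threading.Thread(target=worker,
                              args=("reader", sysfs_lock, md_lock))
    stopper.start()
    reader.start()
    stopper.join()
    reader.join()
    return result["stopper"], result["reader"]


if __name__ == "__main__":
    print(run_ab_ba())  # (False, False): neither side can make progress
```

Each side holds the lock the other one needs, so neither acquisition can succeed until one side lets go. In the sketch the timeout plays the part that killing 'hald' plays in the real case: one holder gives up its lock and the other can then continue.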