Re: mdadm --stop goes off and never comes back?

On 12/22/07, Neil Brown <neilb@xxxxxxx> wrote:
> On Wednesday December 19, jnelson-linux-raid@xxxxxxxxxxx wrote:
> > On 12/19/07, Jon Nelson <jnelson-linux-raid@xxxxxxxxxxx> wrote:
> > > On 12/19/07, Neil Brown <neilb@xxxxxxx> wrote:
> > > > On Tuesday December 18, jnelson-linux-raid@xxxxxxxxxxx wrote:
> > > > >
> > > > > I tried to stop the array:
> > > > >
> > > > > mdadm --stop /dev/md2
> > > > >
> > > > > and mdadm never came back. It's off in the kernel somewhere. :-(
>
> Looking at your stack traces, you have the "mdadm -S" holding
> an md lock and trying to get a sysfs lock as part of tearing down the
> array, and 'hald' is trying to read some attribute in
>    /sys/block/md....
> and is holding the sysfs lock and trying to get the md lock.
> A classic AB-BA deadlock.
>
> >
> > NOTE: kernel is stock openSUSE 10.3 kernel, x86_64, 2.6.22.13-0.3-default.
> >
>
> It is fixed in mainline with some substantial changes to sysfs.
> I don't imagine they are likely to get back ported to openSUSE, but
> you could try logging a bugzilla if you like.

Nah - I'm eagerly awaiting new kernels anyway as I have some network
cards that work much better (read: they work) with 2.6.24rc3+.

> The 'hald' process is interruptible and killing it would release the
> deadlock.

Cool.
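
For anyone reading this in the archives, here is a rough user-space
sketch of the AB-BA pattern described above, and of why interrupting
one side frees the other. It is only an analogy, not the kernel code:
the names md_lock, sysfs_lock, mdadm_stop and hald_read are made up
for illustration, and the timeout on the second lock merely stands in
for hald being interruptible.

```python
import threading

# Hypothetical stand-ins for the two kernel locks named in the thread.
md_lock = threading.Lock()
sysfs_lock = threading.Lock()

# Makes sure each side holds its first lock before asking for the second,
# so the race is lost every time instead of only when unlucky.
both_hold_first = threading.Barrier(2)
events = []


def mdadm_stop():
    """Models 'mdadm -S': holds the md lock, then wants the sysfs lock."""
    with md_lock:
        both_hold_first.wait()
        # Blocks here for as long as the other side holds sysfs_lock.
        with sysfs_lock:
            events.append("mdadm: array torn down")


def hald_read(give_up_after):
    """Models hald: holds the sysfs lock, then wants the md lock."""
    with sysfs_lock:
        both_hold_first.wait()
        # The timeout is the analogue of killing the interruptible hald.
        if md_lock.acquire(timeout=give_up_after):
            md_lock.release()
            events.append("hald: read attribute")
        else:
            events.append("hald: interrupted, releasing sysfs lock")


def run_demo(give_up_after=0.2):
    """Run both sides once and return the order in which they finished."""
    events.clear()
    threads = [
        threading.Thread(target=mdadm_stop),
        threading.Thread(target=hald_read, args=(give_up_after,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(events)


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

Without the timeout both threads would wait on each other forever,
which is the hang seen with mdadm --stop. The usual general cure is to
take the two locks in the same order everywhere.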

> I suspect you have to be fairly unlucky to lose the race but it is
> obviously quite possible.

Sometimes we are all a little unlucky. In my case it cost me a reboot;
in other cases it might cost nothing at all. Fortunately this was not a
production system with lots of users.

> I don't think there is anything I can do on the md side to avoid the
> bug.

In this situation I don't think such a change would be warranted anyway.
Thanks again for looking at this. I'm a big believer in the 'canary in
a coal mine' mentality - some problems may be indications of much more
serious issues, but in this case it appears that the issue has
already been taken care of. Happy Holidays.

-- 
Jon
