Re: mdadm --stop goes off and never comes back?

On 12/19/07, Jon Nelson <jnelson-linux-raid@xxxxxxxxxxx> wrote:
> On 12/19/07, Neil Brown <neilb@xxxxxxx> wrote:
> > On Tuesday December 18, jnelson-linux-raid@xxxxxxxxxxx wrote:
> > > This just happened to me.
> > > Create raid with:
> > >
> > > mdadm --create /dev/md2 --level=raid10 --raid-devices=3
> > > --spare-devices=0 --layout=o2 /dev/sdb3 /dev/sdc3 /dev/sdd3
> > >
> > > cat /proc/mdstat
> > >
> > > md2 : active raid10 sdd3[2] sdc3[1] sdb3[0]
> > >       5855424 blocks 64K chunks 2 offset-copies [3/3] [UUU]
> > >       [==>..................]  resync = 14.6% (859968/5855424)
> > > finish=1.3min speed=61426K/sec
> > >
> > > Some log messages:
> > >
> > > Dec 18 15:02:28 turnip kernel: md: md2: raid array is not clean --
> > > starting background reconstruction
> > > Dec 18 15:02:28 turnip kernel: raid10: raid set md2 active with 3 out
> > > of 3 devices
> > > Dec 18 15:02:28 turnip kernel: md: resync of RAID array md2
> > > Dec 18 15:02:28 turnip kernel: md: minimum _guaranteed_  speed: 1000
> > > KB/sec/disk.
> > > Dec 18 15:02:28 turnip kernel: md: using maximum available idle IO
> > > bandwidth (but not more than 200000 KB/sec) for resync.
> > > Dec 18 15:02:28 turnip kernel: md: using 128k window, over a total of
> > > 5855424 blocks.
> > > Dec 18 15:03:36 turnip kernel: md: md2: resync done.
> > > Dec 18 15:03:36 turnip kernel: md: checkpointing resync of md2.
> > >
> > > I tried to stop the array:
> > >
> > > mdadm --stop /dev/md2
> > >
> > > and mdadm never came back. It's off in the kernel somewhere. :-(
> > >
> > > kill, of course, has no effect.
> > > The machine still runs fine, the rest of the raids (md0 and md1) work
> > > fine (same disks).
> > >
> > > The output (snipped, only mdadm) of 'echo t > /proc/sysrq-trigger'
> > >
> > > Dec 18 15:09:13 turnip kernel: mdadm         S 0001e5359fa38fb0     0
> > > 3943      1 (NOTLB)
> > > Dec 18 15:09:13 turnip kernel:  ffff810033e7ddc8 0000000000000086
> > > 0000000000000000 0000000000000092
> > > Dec 18 15:09:13 turnip kernel:  0000000000000fc7 ffff810033e7dd78
> > > ffffffff80617800 ffffffff80617800
> > > Dec 18 15:09:13 turnip kernel:  ffffffff8061d210 ffffffff80617800
> > > ffffffff80617800 0000000000000000
> > > Dec 18 15:09:13 turnip kernel: Call Trace:
> > > Dec 18 15:09:13 turnip kernel:  [<ffffffff803fac96>]
> > > __mutex_lock_interruptible_slowpath+0x8b/0xca
> > > Dec 18 15:09:13 turnip kernel:  [<ffffffff802acccb>] do_open+0x222/0x2a5
> > > Dec 18 15:09:13 turnip kernel:  [<ffffffff8038705d>] md_seq_show+0x127/0x6c1
> > > Dec 18 15:09:13 turnip kernel:  [<ffffffff80275597>] vma_merge+0x141/0x1ee
> > > Dec 18 15:09:13 turnip kernel:  [<ffffffff802a2aa0>] seq_read+0x1bf/0x28b
> > > Dec 18 15:09:13 turnip kernel:  [<ffffffff8028a42d>] vfs_read+0xcb/0x153
> > > Dec 18 15:09:13 turnip kernel:  [<ffffffff8028a7c1>] sys_read+0x45/0x6e
> > > Dec 18 15:09:13 turnip kernel:  [<ffffffff80209c2e>] system_call+0x7e/0x83
> > >
> > >
> > >
> > > What happened? Is there any debug info I can provide before I reboot?
> >
> > Don't know.... very odd.
> >
> > The rest of the 'sysrq' output would possibly help.
>
> Does this help? It's the same syscall and args, I think, as above.
>
> Dec 18 15:09:13 turnip kernel: hald          S 0001e52f4793e397     0
> 3040      1 (NOTLB)
> Dec 18 15:09:13 turnip kernel:  ffff81003aa51e38 0000000000000086
> 0000000000000000 ffffffff80268ee6
> Dec 18 15:09:13 turnip kernel:  ffff81002a97e5c0 ffff81003aa51de8
> ffffffff80617800 ffffffff80617800
> Dec 18 15:09:13 turnip kernel:  ffffffff8061d210 ffffffff80617800
> ffffffff80617800 ffff81000000bb48
> Dec 18 15:09:13 turnip kernel: Call Trace:
> Dec 18 15:09:13 turnip kernel:  [<ffffffff80268ee6>]
> get_page_from_freelist+0x3c4/0x545
> Dec 18 15:09:13 turnip kernel:  [<ffffffff803fac96>]
> __mutex_lock_interruptible_slowpath+0x8b/0xca
> Dec 18 15:09:13 turnip kernel:  [<ffffffff80387adf>] md_attr_show+0x2f/0x64
> Dec 18 15:09:13 turnip kernel:  [<ffffffff802cd142>] sysfs_read_file+0xb3/0x111
> Dec 18 15:09:13 turnip kernel:  [<ffffffff8028a42d>] vfs_read+0xcb/0x153
> Dec 18 15:09:13 turnip kernel:  [<ffffffff8028a7c1>] sys_read+0x45/0x6e
> Dec 18 15:09:13 turnip kernel:  [<ffffffff80209c2e>] system_call+0x7e/0x83
> Dec 18 15:09:13 turnip kernel:

NOTE: kernel is stock openSUSE 10.3 kernel, x86_64, 2.6.22.13-0.3-default.
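
In case it helps with sifting the rest of the sysrq output: below is a rough
Python sketch that lists every task whose call trace contains a given symbol
(by default the mutex slowpath seen in both traces above). It is only a
sketch under assumptions: the function name tasks_waiting_on and both regular
expressions are my own, modelled on the task-header and frame layout printed
by this 2.6.22 kernel, and it expects the unwrapped lines from the syslog
file rather than the mail-wrapped copies quoted here.

```python
import re

# Task header as it appears in this log (unwrapped), e.g.
# "kernel: mdadm         S 0001e5359fa38fb0     0  3943      1 (NOTLB)"
# Fields assumed: name, state, 16 hex digits, free stack, pid, parent pid.
TASK_RE = re.compile(
    r"kernel: (\S+)\s+([A-Z])\s+[0-9a-f]{16}\s+\d+\s+(\d+)\s+\d+"
)

# Stack frame, e.g. "[<ffffffff803fac96>] symbol+0x8b/0xca"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def tasks_waiting_on(lines, symbol="__mutex_lock_interruptible_slowpath"):
    """Return (name, pid, frames) for each task whose trace contains symbol."""
    tasks = []
    current = None
    for line in lines:
        header = TASK_RE.search(line)
        if header:
            # A new task dump starts; later frames belong to it.
            current = (header.group(1), int(header.group(3)), [])
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current[2].append(frame.group(1))
    return [task for task in tasks if symbol in task[2]]
```

Calling tasks_waiting_on(open(path)) on the saved log, where path is wherever
your syslog keeps kernel messages, should give a short list of the processes
queued behind the same lock, which may be easier to post than the full dump.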


-- 
Jon