Re: mdadm --stop goes off and never comes back?

On 12/19/07, Neil Brown <neilb@xxxxxxx> wrote:
> On Tuesday December 18, jnelson-linux-raid@xxxxxxxxxxx wrote:
> > This just happened to me.
> > Create raid with:
> >
> > mdadm --create /dev/md2 --level=raid10 --raid-devices=3
> > --spare-devices=0 --layout=o2 /dev/sdb3 /dev/sdc3 /dev/sdd3
> >
> > cat /proc/mdstat
> >
> > md2 : active raid10 sdd3[2] sdc3[1] sdb3[0]
> >       5855424 blocks 64K chunks 2 offset-copies [3/3] [UUU]
> >       [==>..................]  resync = 14.6% (859968/5855424)
> > finish=1.3min speed=61426K/sec
> >
> > Some log messages:
> >
> > Dec 18 15:02:28 turnip kernel: md: md2: raid array is not clean --
> > starting background reconstruction
> > Dec 18 15:02:28 turnip kernel: raid10: raid set md2 active with 3 out
> > of 3 devices
> > Dec 18 15:02:28 turnip kernel: md: resync of RAID array md2
> > Dec 18 15:02:28 turnip kernel: md: minimum _guaranteed_  speed: 1000
> > KB/sec/disk.
> > Dec 18 15:02:28 turnip kernel: md: using maximum available idle IO
> > bandwidth (but not more than 200000 KB/sec) for resync.
> > Dec 18 15:02:28 turnip kernel: md: using 128k window, over a total of
> > 5855424 blocks.
> > Dec 18 15:03:36 turnip kernel: md: md2: resync done.
> > Dec 18 15:03:36 turnip kernel: md: checkpointing resync of md2.
> >
> > I tried to stop the array:
> >
> > mdadm --stop /dev/md2
> >
> > and mdadm never came back. It's off in the kernel somewhere. :-(
> >
> > kill, of course, has no effect.
> > The machine still runs fine, the rest of the raids (md0 and md1) work
> > fine (same disks).
> >
> > The output (snipped, only mdadm) of 'echo t > /proc/sysrq-trigger'
> >
> > Dec 18 15:09:13 turnip kernel: mdadm         S 0001e5359fa38fb0     0
> > 3943      1 (NOTLB)
> > Dec 18 15:09:13 turnip kernel:  ffff810033e7ddc8 0000000000000086
> > 0000000000000000 0000000000000092
> > Dec 18 15:09:13 turnip kernel:  0000000000000fc7 ffff810033e7dd78
> > ffffffff80617800 ffffffff80617800
> > Dec 18 15:09:13 turnip kernel:  ffffffff8061d210 ffffffff80617800
> > ffffffff80617800 0000000000000000
> > Dec 18 15:09:13 turnip kernel: Call Trace:
> > Dec 18 15:09:13 turnip kernel:  [<ffffffff803fac96>]
> > __mutex_lock_interruptible_slowpath+0x8b/0xca
> > Dec 18 15:09:13 turnip kernel:  [<ffffffff802acccb>] do_open+0x222/0x2a5
> > Dec 18 15:09:13 turnip kernel:  [<ffffffff8038705d>] md_seq_show+0x127/0x6c1
> > Dec 18 15:09:13 turnip kernel:  [<ffffffff80275597>] vma_merge+0x141/0x1ee
> > Dec 18 15:09:13 turnip kernel:  [<ffffffff802a2aa0>] seq_read+0x1bf/0x28b
> > Dec 18 15:09:13 turnip kernel:  [<ffffffff8028a42d>] vfs_read+0xcb/0x153
> > Dec 18 15:09:13 turnip kernel:  [<ffffffff8028a7c1>] sys_read+0x45/0x6e
> > Dec 18 15:09:13 turnip kernel:  [<ffffffff80209c2e>] system_call+0x7e/0x83
> >
> >
> >
> > What happened? Is there any debug info I can provide before I reboot?
>
> Don't know.... very odd.
>
> The rest of the 'sysrq' output would possibly help.

Does this help? It's hald rather than mdadm, but I think it is stuck
the same way as above: inside sys_read, waiting in
__mutex_lock_interruptible_slowpath.

Dec 18 15:09:13 turnip kernel: hald          S 0001e52f4793e397     0
3040      1 (NOTLB)
Dec 18 15:09:13 turnip kernel:  ffff81003aa51e38 0000000000000086
0000000000000000 ffffffff80268ee6
Dec 18 15:09:13 turnip kernel:  ffff81002a97e5c0 ffff81003aa51de8
ffffffff80617800 ffffffff80617800
Dec 18 15:09:13 turnip kernel:  ffffffff8061d210 ffffffff80617800
ffffffff80617800 ffff81000000bb48
Dec 18 15:09:13 turnip kernel: Call Trace:
Dec 18 15:09:13 turnip kernel:  [<ffffffff80268ee6>]
get_page_from_freelist+0x3c4/0x545
Dec 18 15:09:13 turnip kernel:  [<ffffffff803fac96>]
__mutex_lock_interruptible_slowpath+0x8b/0xca
Dec 18 15:09:13 turnip kernel:  [<ffffffff80387adf>] md_attr_show+0x2f/0x64
Dec 18 15:09:13 turnip kernel:  [<ffffffff802cd142>] sysfs_read_file+0xb3/0x111
Dec 18 15:09:13 turnip kernel:  [<ffffffff8028a42d>] vfs_read+0xcb/0x153
Dec 18 15:09:13 turnip kernel:  [<ffffffff8028a7c1>] sys_read+0x45/0x6e
Dec 18 15:09:13 turnip kernel:  [<ffffffff80209c2e>] system_call+0x7e/0x83
Dec 18 15:09:13 turnip kernel:

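In case it is useful for sifting the rest of the sysrq-t dump, here is
a rough Python sketch that lists every task whose call trace contains a
given symbol, such as __mutex_lock_interruptible_slowpath. It assumes
the raw syslog with one record per line (not the mail-wrapped form
above) and the task header layout this kernel prints. The names
tasks_waiting_in, TASK_RE, FRAME_RE and SAMPLE are made up for the
example, and SAMPLE is just the first lines of the mdadm trace quoted
above, unwrapped by hand.

```python
import re

# Task header as printed by sysrq-t on this kernel (assumed layout), e.g.
#   "kernel: mdadm         S 0001e5359fa38fb0     0  3943      1 (NOTLB)"
# Captures: task name, state letter, pid.
TASK_RE = re.compile(
    r"kernel:\s+(\S+)\s+([A-Z])\s+[0-9a-f]+\s+\d+\s+(\d+)\s+\d+"
)

# Stack frame, e.g. "[<ffffffff8028a42d>] vfs_read+0xcb/0x153"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def tasks_waiting_in(lines, symbol):
    """Return {(name, pid): [frame symbols]} for tasks whose trace has symbol."""
    traces = {}
    current = None
    for line in lines:
        task = TASK_RE.search(line)
        if task:
            # A new task header starts a new trace.
            current = (task.group(1), int(task.group(3)))
            traces[current] = []
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            traces[current].append(frame.group(1))
    return {t: frames for t, frames in traces.items() if symbol in frames}


# First lines of the mdadm trace from this thread, one record per line.
SAMPLE = [
    "Dec 18 15:09:13 turnip kernel: mdadm         S 0001e5359fa38fb0     0  3943      1 (NOTLB)",
    "Dec 18 15:09:13 turnip kernel: Call Trace:",
    "Dec 18 15:09:13 turnip kernel:  [<ffffffff803fac96>] __mutex_lock_interruptible_slowpath+0x8b/0xca",
    "Dec 18 15:09:13 turnip kernel:  [<ffffffff802acccb>] do_open+0x222/0x2a5",
    "Dec 18 15:09:13 turnip kernel:  [<ffffffff8038705d>] md_seq_show+0x127/0x6c1",
]

if __name__ == "__main__":
    hits = tasks_waiting_in(SAMPLE, "__mutex_lock_interruptible_slowpath")
    for (name, pid), frames in hits.items():
        print(name, pid, " <- ".join(frames))
```

Run against the full log, something like this should show which other
tasks are queued on the same lock, though it cannot tell which task is
actually holding it.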

-- 
Jon