Re: power outage while raid5->raid6 was in progress

Hi,

I did as you suggested:
 * upgraded to 2.6.34.1
 * went back to the 5-disk raid5
 * started the raid5 -> raid6 reshape again

This time I got no error and it began reshaping:
md0 : active raid6 sdk1[0] sdf1[5] sdg1[6] sdh1[7] sdi1[4] sdj1[3] sdl1[2] sde1[1]
      5860543744 blocks super 0.91 level 6, 64k chunk, algorithm 18 [8/7] [UUUUU_UU]
      [>....................]  reshape =  2.8% (41047020/1465135936) finish=32916.5min speed=721K/sec
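
As a cross-check of the progress line above, here is a minimal Python sketch that parses it and recomputes the percentage and the remaining time. It assumes the block counts are in 1K units and that speed is in K/sec, as /proc/mdstat normally reports them; the function name parse_reshape_progress is only illustrative.

```python
import re

# Progress line copied from the /proc/mdstat output above.
LINE = ("[>....................]  reshape =  2.8% (41047020/1465135936) "
        "finish=32916.5min speed=721K/sec")

# Assumed layout of the progress line: percent, (done/total), finish, speed.
PROGRESS_RE = re.compile(
    r"reshape\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s*"
    r"speed=(?P<speed>\d+)K/sec"
)


def parse_reshape_progress(line):
    """Return progress figures from one mdstat reshape line, or None."""
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    done = int(m.group("done"))
    total = int(m.group("total"))
    speed = int(m.group("speed"))  # K/sec
    remaining = total - done       # 1K blocks still to reshape
    return {
        "percent": 100.0 * done / total,
        "remaining_blocks": remaining,
        # Remaining time at the current speed; None if the reshape is stalled.
        "eta_minutes": remaining / speed / 60.0 if speed else None,
        "reported_finish_minutes": float(m.group("finish")),
    }


if __name__ == "__main__":
    print(parse_reshape_progress(LINE))
```

The recomputed estimate lands within a few minutes of the reported finish= value, so the roughly 23-day figure follows directly from the 721K/sec speed rather than from a display glitch.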

The reshape got stuck again. In top I can see ksoftirqd using 100% CPU on one
of the 2 cores. But there is one difference: this time there are no complaints about any
disk device(s). Here is the dmesg output. The lower part (from 'blocked for more than
120 seconds') is repeated multiple times.

[  612.736649] raid5: device sdk1 operational as raid disk 0
[  612.736653] raid5: device sdi1 operational as raid disk 4
[  612.736656] raid5: device sdj1 operational as raid disk 3
[  612.736658] raid5: device sdl1 operational as raid disk 2
[  612.736660] raid5: device sde1 operational as raid disk 1
[  612.737301] raid5: allocated 6386kB for md0
[  612.749007] 0: w=1 pa=0 pr=6 m=2 a=18 r=6 op1=0 op2=0
[  612.749011] 4: w=2 pa=0 pr=6 m=2 a=18 r=6 op1=0 op2=0
[  612.749014] 3: w=3 pa=0 pr=6 m=2 a=18 r=6 op1=0 op2=0
[  612.749016] 2: w=4 pa=0 pr=6 m=2 a=18 r=6 op1=0 op2=0
[  612.749018] 1: w=5 pa=0 pr=6 m=2 a=18 r=6 op1=0 op2=0
[  612.749021] raid5: raid level 6 set md0 active with 5 out of 6 devices, algorithm 18
[  612.749065] RAID5 conf printout:
[  612.749067]  --- rd:6 wd:5
[  612.749069]  disk 0, o:1, dev:sdk1
[  612.749071]  disk 1, o:1, dev:sde1
[  612.749073]  disk 2, o:1, dev:sdl1
[  612.749075]  disk 3, o:1, dev:sdj1
[  612.749077]  disk 4, o:1, dev:sdi1
[  612.779095] RAID5 conf printout:
[  612.779099]  --- rd:8 wd:7
[  612.779102]  disk 0, o:1, dev:sdk1
[  612.779104]  disk 1, o:1, dev:sde1
[  612.779105]  disk 2, o:1, dev:sdl1
[  612.779107]  disk 3, o:1, dev:sdj1
[  612.779109]  disk 4, o:1, dev:sdi1
[  612.779111]  disk 5, o:1, dev:sdf1
[  612.779117] RAID5 conf printout:
[  612.779118]  --- rd:8 wd:7
[  612.779120]  disk 0, o:1, dev:sdk1
[  612.779122]  disk 1, o:1, dev:sde1
[  612.779123]  disk 2, o:1, dev:sdl1
[  612.779125]  disk 3, o:1, dev:sdj1
[  612.779127]  disk 4, o:1, dev:sdi1
[  612.779129]  disk 5, o:1, dev:sdf1
[  612.779130]  disk 6, o:1, dev:sdg1
[  612.779134] RAID5 conf printout:
[  612.779135]  --- rd:8 wd:7
[  612.779136]  disk 0, o:1, dev:sdk1
[  612.779138]  disk 1, o:1, dev:sde1
[  612.779140]  disk 2, o:1, dev:sdl1
[  612.779142]  disk 3, o:1, dev:sdj1
[  612.779143]  disk 4, o:1, dev:sdi1
[  612.779145]  disk 5, o:1, dev:sdf1
[  612.779147]  disk 6, o:1, dev:sdg1
[  612.779149]  disk 7, o:1, dev:sdh1
[  612.779233] md: reshape of RAID array md0
[  612.779235] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[  612.779237] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[  612.779244] md: using 128k window, over a total of 1465135936 blocks.
[ 2846.784515] Clocksource tsc unstable (delta = 4398042075725 ns)
[ 3000.536656] INFO: task md0_reshape:5554 blocked for more than 120 seconds.
[ 3000.536675] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3000.536699] md0_reshape   D 0000000100099eb1     0  5554      2 0x00000000
[ 3000.536704]  ffff88007e08f700 0000000000000046 0000000000000082 ffff88006d3d1290
[ 3000.536707]  ffffffff81632020 0000000000015280 0000000000015280 0000000000015280
[ 3000.536711]  ffff88007e1f3fd8 0000000000015280 ffff88007e08f700 ffff88007e1f3fd8
[ 3000.536714] Call Trace:
[ 3000.536735]  [<ffffffffa019e735>] ? reshape_request+0x1e7/0x7a8 [raid456]
[ 3000.536741]  [<ffffffff8105d519>] ? autoremove_wake_function+0x0/0x2a
[ 3000.536746]  [<ffffffffa019ee87>] ? sync_request+0x191/0x2d1 [raid456]
[ 3000.536754]  [<ffffffffa0180a0b>] ? is_mddev_idle+0xa2/0xf5 [md_mod]
[ 3000.536760]  [<ffffffffa0181171>] ? md_do_sync+0x713/0xb0f [md_mod]
[ 3000.536764]  [<ffffffff8103767e>] ? update_curr+0xa2/0x145
[ 3000.536771]  [<ffffffffa0181d4a>] ? md_thread+0xf2/0x110 [md_mod]
[ 3000.536777]  [<ffffffffa0181c58>] ? md_thread+0x0/0x110 [md_mod]
[ 3000.536779]  [<ffffffff8105d0e5>] ? kthread+0x75/0x7d
[ 3000.536783]  [<ffffffff810097e4>] ? kernel_thread_helper+0x4/0x10
[ 3000.536786]  [<ffffffff8105d070>] ? kthread+0x0/0x7d
[ 3000.536788]  [<ffffffff810097e0>] ? kernel_thread_helper+0x0/0x10
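
Since the hung-task report repeats, a small Python sketch like the following could count the repeats in a saved dmesg dump. It assumes the "INFO: task NAME:PID blocked for more than N seconds" wording shown in the log above; find_hung_tasks and SAMPLE are illustrative names, not part of any tool.

```python
import re

# Matches the hung-task header line as it appears in the dmesg output above.
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (timestamp, task name, pid, timeout seconds) per report."""
    return [
        (float(m.group("ts")), m.group("name"),
         int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_RE.finditer(log_text)
    ]


# Three lines copied from the log above, used as sample input.
SAMPLE = """\
[ 2846.784515] Clocksource tsc unstable (delta = 4398042075725 ns)
[ 3000.536656] INFO: task md0_reshape:5554 blocked for more than 120 seconds.
[ 3000.536675] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
"""

if __name__ == "__main__":
    for ts, name, pid, secs in find_hung_tasks(SAMPLE):
        print(f"{ts:.6f}s: {name} (pid {pid}) blocked > {secs}s")
```

Feeding the full dmesg text to find_hung_tasks gives one tuple per repeat, which makes it easy to see how long md0_reshape has been blocked.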

-- Sebastian


