I either have more info or a totally different scenario.

I initiated a raid5->raid6 reshape on a different machine. At the same
time I (perhaps stupidly) ran resize2fs to shrink the ext4 fs on the
array being reshaped. The reshape is going slowly (as I would expect)
but the resize is nearly dead. It is only able to write a single 4k
block to the array about every 5-6 seconds.

If that's expected then sorry for the noise and please ignore the rest.
Otherwise...

When I tried reducing sync_speed_min to 1000 the resize2fs write
interval increased to once per 8-10s. When I lowered sync_speed_max to
1000 I saw no more writes until I set the min/max back to 500000, at
which point iostat reported a w_await time roughly equal to the time the
array had the lower max.

The '# echo t > /proc/sysrq-trigger' output seems to indicate that
resize2fs is stuck doing an fsync. The full dump is attached.

Here's an iostat showing 100% utilization of the LVM volume and the 4k
block writes.

Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm   %util
md2         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00
dm-2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     1.00     0.00     0.00     0.00     0.00  100.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm   %util
md2         0.00    0.00    0.00    1.00    0.00    4.00     8.00     0.00     0.00     0.00     0.00     0.00    0.00
dm-2        0.00    0.00    0.00    1.00    0.00    4.00     8.00     1.00  6200.00     0.00  6200.00  1000.00  100.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm   %util
md2         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00
dm-2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     1.00     0.00     0.00     0.00     0.00  100.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm   %util
md2         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00
dm-2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     1.00     0.00     0.00     0.00     0.00  100.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm   %util
md2         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00
dm-2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     1.00     0.00     0.00     0.00     0.00  100.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm   %util
md2         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00
dm-2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     1.00     0.00     0.00     0.00     0.00  100.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm   %util
md2         0.00    0.00    0.00    1.00    0.00    4.00     8.00     0.00     0.00     0.00     0.00     0.00    0.00
dm-2        0.00    0.00    0.00    1.00    0.00    4.00     8.00     1.00  5450.00     0.00  5450.00  1000.00  100.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm   %util
md2         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00
dm-2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     1.00     0.00     0.00     0.00     0.00  100.00

/proc/mdstat

Personalities : [raid1] [raid6] [raid5] [raid4] [raid0]
md10 : active raid5 md11[3] sdo1[1] sdn1[0]
      1250259200 blocks super 1.2 level 5, 128k chunk, algorithm 2 [3/3] [UUU]

md11 : active raid0 sdm1[1] sdad1[0]
      625138176 blocks super 1.2 128k chunks

md5 : active raid5 sdq1[6] sdp1[4] sdf1[3] sde1[2] sdd1[1] sdc1[0]
      1220981760 blocks super 1.2 level 5, 128k chunk, algorithm 2 [6/6] [UUUUUU]

md3 : active raid6 sdy2[4] md4[2] sdx3[1] sdt3[0]
      1875411968 blocks super 1.2 level 6, 128k chunk, algorithm 2 [4/3] [UUU_]
        resync=DELAYED

md2 : active raid6 sdy1[11] sdz2[0] sdt2[10] sdx2[9] sdu2[8] sds2[7] sdw2[6] sdv2[5] sdr2[4] sdac2[3] sdaa2[2] sdab2[1]
      6641912320 blocks super 0.91 level 6, 128k chunk, algorithm 18 [12/11] [UUUUUUUUUUU_]
      [===============>.....]  reshape = 77.4% (514551296/664191232) finish=1030.3min speed=2420K/sec

md1 : active raid5 sdj1[0] sdu1[14] sdx1[13] sdt1[12] sdv1[11] sdw1[10] sdh1[9] sdac1[8] sdaa1[7] sdr1[6] sdab1[5] sdz1[4] sdk1[3] sds1[2] sdl1[1]
      4375960064 blocks level 5, 64k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]

md4 : active raid0 sdi1[1] sdg1[0]
      976769024 blocks super 1.2 128k chunks

md0 : active raid1 sdb1[1] sda1[0]
      41941944 blocks super 1.2 [2/2] [UU]

unused devices: <none>

Kernel: 3.2.9-1.fc16.x86_64
mdadm: v3.2.3
resize2fs: 1.41.14

--Larkin

On 1/8/2012 6:26 PM, NeilBrown wrote:
> On Sun, 08 Jan 2012 16:03:10 -0600 Larkin Lowrey <llowrey@xxxxxxxxxxxxxxxxx>
> wrote:
>
>> Suggestions?
>
> # echo t > /proc/sysrq-trigger
>
> and capture that messages that go to 'dmesg'. Post them.
>
> Hopefully your message ring buffer is big enough to collect the entire
> output. If it isn't you might need to boot with
>     log_buf_len=1M
> or similar.
>
> That should show what process is blocking on what.
>
> NeilBrown
<<attachment: resize2fs_hang.zip>>
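
For anyone wanting to sanity-check the reshape numbers in the md2 entry of /proc/mdstat above, here is a rough Python sketch that recomputes the time remaining from the progress line. The helper name reshape_eta_minutes and the regular expression are made up for illustration and are not part of mdadm or the kernel. It also assumes the two block counts and the speed figure are in the same 1K units, which seems to fit the reported values but is an assumption on my part.

```python
import re

# Hypothetical pattern for the progress line that /proc/mdstat prints
# during a reshape; written against the md2 line quoted in this message.
PROGRESS = re.compile(
    r"reshape\s*=\s*[\d.]+%\s*\((\d+)/(\d+)\)\s*"
    r"finish=([\d.]+)min\s*speed=(\d+)K/sec"
)


def reshape_eta_minutes(line):
    """Return (estimated_minutes, reported_minutes) for a reshape line."""
    m = PROGRESS.search(line)
    if m is None:
        raise ValueError("no reshape progress found in line")
    done, total = int(m.group(1)), int(m.group(2))
    reported = float(m.group(3))
    # Assumption: speed (K/sec) uses the same 1K units as the block counts.
    speed = int(m.group(4))
    estimated = (total - done) / speed / 60
    return estimated, reported


line = ("[===============>.....]  reshape = 77.4% (514551296/664191232) "
        "finish=1030.3min speed=2420K/sec")
est, rep = reshape_eta_minutes(line)
print(f"estimated {est:.1f} min, mdstat reported {rep} min")
```

The estimate lands within a minute of the finish= value md reports, so the figures are at least self-consistent. The small difference is presumably down to how the kernel averages the speed, which this sketch makes no attempt to reproduce.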