On Tue, Oct 10 2017, Curt wrote:

>>
>> Just --freeze-reshape, not --update.
>>
> Ok, here's the output
> mdadm --detail /dev/md127
> /dev/md127:
>            Version : 0.91
>      Creation Time : Fri Jun 15 15:52:05 2012
>         Raid Level : raid6
>         Array Size : 9767519360 (9315.03 GiB 10001.94 GB)
>      Used Dev Size : 1953503872 (1863.01 GiB 2000.39 GB)
>       Raid Devices : 8
>      Total Devices : 6
>    Preferred Minor : 127
>        Persistence : Superblock is persistent
>
>        Update Time : Tue Oct 10 15:11:26 2017
>              State : clean, FAILED, reshaping
>     Active Devices : 5
>    Working Devices : 6
>     Failed Devices : 0
>      Spare Devices : 1
>
>             Layout : left-symmetric
>         Chunk Size : 64K
>
> Consistency Policy : unknown
>
>     Reshape Status : 0% complete
>      Delta Devices : 1, (7->8)
>
>               UUID : 714a612d:9bd35197:36c91ae3:c168144d
>             Events : 0.11559682
>
>     Number   Major   Minor   RaidDevice State
>        0       8       97        0      active sync   /dev/sdg1
>        1       8       49        1      active sync   /dev/sdd1
>        2       8       33        2      active sync   /dev/sdc1
>        3       8        1        3      active sync   /dev/sda1
>        4      65      145        4      active sync   /dev/sdz1
>        -       0        0        5      removed
>        6       8       16        6      spare rebuilding   /dev/sdb
>        -       0        0        7      removed
>
> But in my dmesg, I'm seeing task md127_reshape blocked for 120
> seconds, and when I cat sync_action, it shows reshape. Which
> shouldn't it be frozen or something like that? Also md127_raid6 task
> is using 100% cpu. I was going to paste the assemble output, but hit
> clear instead of copy. It didn't show any errors I saw, just starting
> with 6 drives. reshape isn't using any cpu
>
> If I do a cat of /proc/pid/stack, all I get is
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> Should I just let it run?

Clearly a kernel bug. What kernel are you using? Can you try a newer
one easily?

Can you please

  mkdir /tmp/dump
  mdadm --dump=/tmp/dump /dev...list.all.devices.in.the.array
  tar --sparse -czf /tmp/dump.tgz /tmp/dump

and send me /tmp/dump.tgz. It will only contain the metadata. I can
then create an identical-looking array and experiment.

I doubt that letting it run will bring any benefit.

NeilBrown
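
For the kernel-version question above, the details could be gathered in one step with something like the following. This is only a minimal sketch: the helper name md_report and its optional second argument (an alternate sysfs root, useful only for trying the function out without a real array) are illustrative assumptions and not part of mdadm. It reads the standard md sysfs attribute that "cat sync_action" refers to.

```shell
# Hypothetical helper (not part of mdadm): print the running kernel
# version and the current sync_action of one md array.
#   $1 = array name, e.g. md127
#   $2 = sysfs root (assumption for testing; defaults to /sys)
md_report() {
    md_name="$1"
    sysfs_root="${2:-/sys}"

    # Kernel release, as asked for in the reply.
    echo "kernel: $(uname -r)"

    # sync_action shows what the array is doing (e.g. idle, frozen, reshape).
    action_file="$sysfs_root/block/$md_name/md/sync_action"
    if [ -r "$action_file" ]; then
        echo "sync_action: $(cat "$action_file")"
    else
        echo "sync_action: unavailable"
    fi
}
```

Running `md_report md127` on the affected machine prints both values, ready to paste into a reply.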