Hello again,

On 17.07.2018 15:04, Christian wrote:
> Hello all,
>
> Today I tried to grow a RAID 5 from 4 to 5 disks but failed to do so.
> The additional disk was prepared by writing a GPT disklabel to it
> using parted, creating a primary partition and setting the raid flag
> to "on".
>
> Then I added it to the array and tried to grow it using
> # mdadm /dev/md0 --add /dev/sdc1
> # mdadm --grow --raid-devices=5 /dev/md0 --backup-file=/root/md0.bak
>
> /proc/mdstat showed an increasing ETA with 0 KB/s progress.
> Then I noticed that I had accidentally inserted the new disk into the
> server while it was on-line, and had only rebooted the virtual machine
> (which has direct access to the SATA controller, as it is passed
> through), not the host...
> After that I rebooted the VM host (XenServer). The VM didn't boot,
> because it failed to mount a filesystem which is located on LVM on top
> of the RAID.
>
> After disabling the fstab entry, the system booted up again. But now
> md0 gets assembled with all devices marked as spares.
>
> Trying to reassemble the RAID fails:
> # mdadm --assemble --scan
> mdadm: Failed to restore critical section for reshape, sorry.
>        Possibly you needed to specify the --backup-file
>
> I tried passing the /root/md0.bak file as --backup-file, but it still
> doesn't work.
>
> From that point on, all actions were performed on a snapshot of the
> underlying partitions.
>
> Maybe somebody is able to guide me how to reassemble my array without
> losing too much data?
>
> Thanks in advance,
>
> Christian
>
> Some diagnostic output follows:
> [...]

I finally found a solution after investigating further.

The backup file consists of about 6.1 MiB of zeroes. That's why I
skipped the --backup-file parameter for the following commands.

# mdadm --assemble /dev/md0 --force --verbose --invalid-backup /dev/sda1 /dev/sdd1 /dev/sde1 /dev/sdb1 /dev/sdc1

This command resulted in the following message:

mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument

The syslog contained the following line:

md/raid:md0: reshape_position too early for auto-recovery - aborting.

That led me to the solution of reverting the grow operation:

# mdadm --assemble /dev/md0 --force --verbose --update=revert-reshape --invalid-backup /dev/sda1 /dev/sdd1 /dev/sde1 /dev/sdb1 /dev/sdc1

Growing the RAID with the initial command fails even after removing the
new device from the array (overwriting the disk's first 4 MB after
removal) and re-adding it. The syslog gives a hint at the reason for
the failure (all messages logged within the same second):

kernel: [68111.425022] RAID conf printout:
kernel: [68111.425027]  --- level:5 rd:5 wd:5
kernel: [68111.425043]  disk 0, o:1, dev:sda1
kernel: [68111.425044]  disk 1, o:1, dev:sdd1
kernel: [68111.425045]  disk 2, o:1, dev:sde1
kernel: [68111.425045]  disk 3, o:1, dev:sdb1
kernel: [68111.425046]  disk 4, o:1, dev:sdc1
kernel: [68111.425358] md: reshape of RAID array md0
kernel: [68111.425359] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
kernel: [68111.425359] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
kernel: [68111.425362] md: using 128k window, over a total of 5860391424k.
systemd[1]: Created slice system-mdadm\x2dgrow\x2dcontinue.slice.
systemd[1]: Started Manage MD Reshape on /dev/md0.
systemd[1]: mdadm-grow-continue@md0.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
systemd[1]: mdadm-grow-continue@md0.service: Unit entered failed state.
systemd[1]: mdadm-grow-continue@md0.service: Failed with result 'exit-code'.
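(Side note for anyone debugging the same failure: the reshape is
resumed by a systemd template unit, so its exit status and log can also
be inspected directly; the unit name below is taken from the messages
above.

# systemctl status mdadm-grow-continue@md0.service
# journalctl -u mdadm-grow-continue@md0.service

I pieced the story together from syslog, but the journal should show
the same status=2/INVALIDARGUMENT exit.)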
The mdadm-grow-continue failure led me to this Debian bug report [1].
Removing the --backup-file parameter from the grow command let me grow
the array as intended - problem solved.

Regards,

Christian

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=884719
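P.S. If someone replays this on their own array: the reshape progress
and the final five-disk layout can be checked with the usual commands
(only the md0 name is specific to my setup):

# cat /proc/mdstat
# mdadm --detail /dev/md0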