Thank you, Kuai!
So my gut instinct was not that bad. Now that I have been able to reassemble my raid set (it tried to resume the rebuild, and I stopped it),
I have a /dev/md0, but it seems that no sensible data is stored on it. Not even a partition table could be found.
From your investigations, what would you say: is there hope that I could rescue some of the data from the raid set with a tool
like testdisk if I "recreate" my old GPT partition table? Or is it likely that the restarted reshape/grow process made
minced meat out of all my raid data?
I found it interesting that the first grow/reshape process did not seem to touch the two added discs, which are shown as
spares now; their partition tables had not been touched. The process seems to have dealt only with my legacy raid 5 set of
six disks and seemed to move it to a transient raid5/6 architecture, therefore operating at least on disc (3) of the legacy
set, which is now missing.
I'm not sure how much time it is sensible to spend on this data;
your advice would be very helpful.
regards
Peter
On 04.05.23 at 10:16, Yu Kuai wrote:
Hi,
On 2023/04/28 5:09, Peter Neuwirth wrote:
Hello linux-raid group.
I have an issue with my linux raid setup and I hope somebody here
could help me get my raid active again without data loss.
I have a Debian 11 system with one raid array (6x 1TB HDDs, raid level 5)
that was active and running until today, when I added two more 1TB HDDs
and also changed the raid level to 6.
Note: For completeness:
My raid setup month ago was
mdadm --create --verbose /dev/md0 -c 256K --level=5 --raid-devices=6 /dev/sdd /dev/sdc /dev/sdb /dev/sda /dev/sdg /dev/sdf
mkfs.xfs -d su=254k,sw=6 -l version=2,su=256k -s size=4k /dev/md0
mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf
update-initramfs -u
echo '/dev/md0 /mnt/data ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab
Today I did:
mdadm --add /dev/md0 /dev/sdg /dev/sdh
sudo mdadm --grow /dev/md0 --level=6
This started a grow process, which I could observe with
watch -n 1 cat /proc/mdstat
and md0 was still usable all day.
Because I needed fast file access, I paused the grow and insertion
process today at about 50% by issuing
echo "frozen" > /sys/block/md0/md/sync_action
After the file access was done, I restarted the
process with
echo reshape > /sys/block/md0/md/sync_action
After looking into this problem, I figured out that this is how the problem
(corrupted data) was triggered in the first place, while the kernel log
message "md: cannot handle concurrent replacement and reshape"
is not fatal.
"echo reshape" restarts the whole process, whereas the recorded reshape
position should be used. This is a serious kernel bug; I'll try to fix
it soon.
By the way, "echo idle" should avoid this problem.
Thanks,
Kuai
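For reference, the pause/resume sequence discussed above could be sketched roughly as follows. This is only an illustration under assumptions: the function names and the MD_DIR variable are hypothetical, the path assumes the array is md0, and whether writing "idle" really resumes from the recorded position on a given kernel should be checked against that kernel's md documentation before trying it on real data.

```shell
#!/bin/sh
# Sketch only: pause and resume an md reshape through sysfs.
# MD_DIR is an assumed location; adjust it to the array in question.
MD_DIR="${MD_DIR:-/sys/block/md0/md}"

# Pause background sync/reshape activity (as done in the report above).
md_pause() {
    echo frozen > "$MD_DIR/sync_action"
}

# Resume by writing "idle" rather than "reshape", following the
# suggestion above, so that md can continue from its recorded position.
md_resume() {
    echo idle > "$MD_DIR/sync_action"
}

# Print the current action and, if present, the recorded reshape position.
md_status() {
    printf 'sync_action: %s\n' "$(cat "$MD_DIR/sync_action")"
    if [ -r "$MD_DIR/reshape_position" ]; then
        printf 'reshape_position: %s\n' "$(cat "$MD_DIR/reshape_position")"
    fi
}
```

Calling md_status before and after md_resume would show whether the reshape position was kept or reset.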
but I saw in mdstat that it started from scratch.
After about 5 min I noticed that the /dev/md0 mount was gone with
an input/output error in syslog, and I rebooted the computer to see whether the
kernel would reassemble md0 correctly. Maybe this was a problem,
because md0 was still reshaping; I do not know.