On 16/01/2023 11:12, Tiago Afonso wrote:
Hi,
This is the short version; the full story is here:
https://forum.openmediavault.org/index.php?thread/45829-raid-5-growing-hanged-at-0-0/
I had a 4x4TB raid5 configuration and I added a new 4TB drive in order
to grow the array as 5x4TB raid5. I did this via openmediavault GUI
(my mistake, maybe I should have checked tutorials and more info on
how to do it properly). The reshape started but hung right at the
beginning.
root@openmediavault:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md127 : active raid5 sdh[4] sdg[7] sdf[6] sde[5] sdi[8]
      11720661504 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      [>....................]  reshape =  0.0% (1953276/3906887168) finish=2008857.0min speed=32K/sec
      bitmap: 6/30 pages [24KB], 65536KB chunk
I forced a shutdown by pulling the power, as it would not let me stop
the process any other way. I even put the disks in another machine, to
no avail. The problem seems to be that the array was not healthy and
had bad blocks, and I think that is why the reshape keeps stopping. The
bad blocks are on two of the HDDs and seem to be the same block on
both. Now I'm stuck: I'm unable to reshape back to 4 HDDs (I did try
this) or to access the data.
How can I get out of this situation and recover the data, even if not
all of it?
I read the wiki, but I don't have much experience in linux or raid.
I'm afraid of doing something that puts me in an even worse situation.
Attached is the output of some commands.
Thank you.
Thanks. Looks like you've done your homework :-)
I was about to say this looks like a well-known problem, and then I saw
the mdadm version and the kernel. You should not be getting that problem
with something this up-to-date.
The good news is this still looks like that problem, but the question is
what on earth is going on. Can you boot from a rescue disk rather than
the openmediavault install?
The array is clean but degraded. Not knowing what you've done, I'm not
sure how to put the array together again, but the really good news is it
looks like - because the reshape never started - it shouldn't be a hard
job to retrieve everything. Then we can start sorting things out from there.
I'm going to bring in the two experts - they might take a little while
to respond - but in the meantime two points to ponder ...
(1) raid badblocks are a mis-feature; as soon as we get the array back,
we want them gone.
(2) "SCT Error Recovery not supported" - Your WD Blues are not suitable
as raid drives. I've got Barracudas, and you shouldn't use those in raid
either, so it's not immediately serious, but you want something like a
Seagate IronWolf, or Toshiba N300. I avoid WD for precisely this reason
- they tend to advertise stuff as suitable when it isn't.
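
For anyone wanting to check point (2) on their own drives, here is a rough
sketch, not something taken from this thread. It reads the text that
"smartctl -l scterc" prints for a drive and only prints the command that is
usually suggested for that case; it runs nothing itself. The function name
erc_action is made up for illustration, the matched strings are assumed from
typical smartctl output, and the values 180 (seconds) and 70 (7.0 seconds)
are the commonly quoted ones, so treat them as assumptions too.

```shell
#!/bin/sh
# Suggest how to handle error-recovery timeouts for one drive.
# Reads the output of `smartctl -l scterc /dev/<dev>` on stdin and prints
# (does NOT run) the command commonly suggested for that situation.
erc_action() {
    dev=$1            # bare device name such as sdh (example only)
    report=$(cat)     # smartctl text supplied by the caller
    case $report in
        *"not supported"*)
            # The drive cannot cap its own recovery time, so lengthen the
            # kernel command timeout (in seconds) to outlast its retries.
            echo "echo 180 > /sys/block/$dev/device/timeout" ;;
        *Disabled*)
            # ERC exists but is switched off: cap read/write recovery
            # at 7.0 seconds (the value is given in tenths of a second).
            echo "smartctl -l scterc,70,70 /dev/$dev" ;;
        *)
            echo "# $dev: ERC appears to be set already, nothing to do" ;;
    esac
}
```

On a live system it could be fed with something like
"smartctl -l scterc /dev/sdh | erc_action sdh". Neither setting survives a
reboot or power cycle on most drives, so it would need reapplying at boot.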
If you can afford it (they're not cheap) you might consider getting four
decent raid drives, backing up your existing drives, and then fixing it
when the experts chime in. I'm guessing an "assemble no-bad-blocks
force" will fix everything, but I'm hesitant to say "go ahead and try it".
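
As a guess at what that shorthand would look like spelled out, here is a
dry-run sketch. It only prints the commands and runs nothing, in keeping
with the hesitation above. The function name recovery_plan is hypothetical,
and mapping "no-bad-blocks" to mdadm's --update=force-no-bbl is an
assumption about what is meant. Device names can change between boots and
machines, so the ones in the mdstat output may no longer be right. Whether
an interrupted reshape needs further options is a question for the experts.

```shell
#!/bin/sh
# Dry run only: prints a possible command sequence, executes nothing.
# Usage: recovery_plan <md device> <member device>...
recovery_plan() {
    md=$1
    shift
    for dev in "$@"; do
        # Read-only: show the bad-block list recorded on each member.
        echo "mdadm --examine-badblocks $dev"
    done
    # The array must be stopped before it can be re-assembled.
    echo "mdadm --stop $md"
    # force-no-bbl removes the bad-block list even if it has entries.
    echo "mdadm --assemble --force --update=force-no-bbl $md $*"
}
```

For example, "recovery_plan /dev/md127 /dev/sde /dev/sdf /dev/sdg /dev/sdh
/dev/sdi" prints five examine lines, one stop line and one assemble line to
review before anything is typed for real.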
Cheers,
Wol