On 06/05/2023 17:28, Alex Elder wrote:
I have a 3-drive RAID5 that I set up more than 5 years ago.
At some point, the "middle" drive failed (hardware failure).
I bought a replacement and issued some commands to attempt
to recover, and did *something* wrong. I was too busy to
spend time to fix this then, and one thing led to another,
and now it's 2023 and I'd like to fire the array up again.
The current state is that /dev/md127 gets created automatically
and it contains the two good disks (partitions). I'm pretty
sure this should be easy enough to recover if I issue the
right commands to rebuild things.
The disks are 8TB Seagate Ironwolf drives. (ST8000VN0022-2EL)
They are partitioned identically, with a GPT label. The first
partition starts at offset 2048 sectors and continues to the end.
The RAID device was originally created with this command:
mdadm --create /dev/md/z --level=5 --raid-devices=3 /dev/sd{b,c,d}1
I'm going to provide the output of a few commands below, and
will gladly provide whatever other information is required
to get this fixed.
Thank you.
-Alex
https://raid.wiki.kernel.org/index.php/Linux_Raid
root@meat:/# mdadm --version
mdadm - v4.2 - 2021-12-30 - Ubuntu 4.2-3ubuntu1
root@meat:/#
root@meat:/# mdadm --detail --scan
INACTIVE-ARRAY /dev/md127 metadata=1.2 name=meat:z UUID=8a021a34:f19bbc01:7bcf6f8e:3bea43a9
root@meat:/#
root@meat:/# mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Raid Level : raid5
Total Devices : 2
Persistence : Superblock is persistent
State : inactive
Working Devices : 2
Name : meat:z (local to host meat)
UUID : 8a021a34:f19bbc01:7bcf6f8e:3bea43a9
Events : 9534
    Number   Major   Minor   RaidDevice

       -       8       49        -        /dev/sdd1
       -       8       17        -        /dev/sdb1
root@meat:/#
So the raid array doesn't know about sdc.
root@meat:/# ls -l /dev/sd[bcd]1
brw-rw---- 1 root disk 8, 17 May 6 11:20 /dev/sdb1
brw-rw---- 1 root disk 8, 33 May 6 11:20 /dev/sdc1
brw-rw---- 1 root disk 8, 49 May 6 11:20 /dev/sdd1
root@meat:/#
=========== Here's the disk that got replaced
root@meat:/# mdadm --examine /dev/sdc1
mdadm: No md superblock detected on /dev/sdc1.
root@meat:/#
Nor does sdc know about the array.
=========== Here are the other two disks
root@meat:/# mdadm --examine /dev/sd[bd]1
/dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 8a021a34:f19bbc01:7bcf6f8e:3bea43a9
Name : meat:z (local to host meat)
Creation Time : Sun Oct 22 21:19:23 2017
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 15627786240 sectors (7.28 TiB 8.00 TB)
Array Size : 15627786240 KiB (14.55 TiB 16.00 TB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262064 sectors, after=0 sectors
State : clean
Device UUID : 4cd855b7:ecba9064:b74a2182:bbc8a994
Internal Bitmap : 8 sectors from superblock
Update Time : Sun Dec 13 16:51:21 2020
Bad Block Log : 512 entries available at offset 40 sectors
Checksum : be977447 - correct
Events : 9534
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
Odd ... the superblock still shows all three devices active ('AAA'), although
this is possibly because the array is inactive, so the drive hasn't realised
sdc has gone AWOL ...
/dev/sdd1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 8a021a34:f19bbc01:7bcf6f8e:3bea43a9
Name : meat:z (local to host meat)
Creation Time : Sun Oct 22 21:19:23 2017
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 15627786240 sectors (7.28 TiB 8.00 TB)
Array Size : 15627786240 KiB (14.55 TiB 16.00 TB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262064 sectors, after=0 sectors
State : clean
Device UUID : 1d2ce616:943d0c86:fa9ef8ad:87e31be0
Internal Bitmap : 8 sectors from superblock
Update Time : Sun Dec 13 16:51:21 2020
Bad Block Log : 512 entries available at offset 40 sectors
Checksum : b6ece242 - correct
Events : 9534
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
root@meat:/#
Same again. Note that both superblocks report the same Events count (9534)
and the same Update Time, so the two surviving devices are at least
consistent with each other.
Okay, the first thing I would do is add sdc1 again. Does sdc1 actually
exist? The ls above does show a /dev/sdc1 node, but possibly what you did
wrong was not partitioning the replacement drive to match the other two.
What I'm hoping is that re-adding it will trigger a rebuild and everything
will be hunky-dory - see the sketch below.
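A minimal sketch, using the device names from your output above (untested
on your array; the sgdisk steps are only needed if sdc's partition table
doesn't already match the other two drives):

root@meat:/# sgdisk --replicate=/dev/sdc /dev/sdb   # copy sdb's GPT onto sdc
root@meat:/# sgdisk --randomize-guids /dev/sdc      # give sdc its own GUIDs
root@meat:/# mdadm /dev/md127 --add /dev/sdc1

If mdadm refuses the --add while the array is inactive, activate the array
first (see below) and then retry the add.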
What MIGHT happen, though, is that you end up with an inactive array
with two active drives and a spare. At which point it should simply be a
matter of activating the array (snag is, I'm not sure of the canonical way -
two possibilities are sketched below). The fact that the array was
inactivated is a safety feature - whatever went wrong, md realised that
something wasn't right and inactivated the array to protect it.
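Two hedged possibilities for activating it (I haven't had to do this
myself, so treat these as starting points rather than a recipe):

root@meat:/# mdadm --run /dev/md127

or, if that fails, stop the array and re-assemble it from the two good
members, forcing a degraded start:

root@meat:/# mdadm --stop /dev/md127
root@meat:/# mdadm --assemble --run /dev/md127 /dev/sdb1 /dev/sdd1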
So to sum up, I'm expecting that just re-adding sdc1 (because the add
clearly failed last time) will trigger a rebuild, at which point the array
will be clean and should just reactivate.
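Once a rebuild is under way you can watch it in /proc/mdstat (standard md
behaviour, nothing specific to your setup):

root@meat:/# cat /proc/mdstat

A recovery line with a percentage should appear against md127 while sdc1
is resyncing.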
If that doesn't succeed, it won't do any damage, but it will need someone
with more knowledge than me for the final steps.
Cheers,
Wol