Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)

Given what the array is reporting, I doubt that is going to fix
anything. Because the array was in the middle of a reshape, it is
likely that neither the old nor the new device count is the right RAID
layout for at least half of the data, so the filesystem is most likely
completely broken.
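
It would also help to know how far the reshape actually got before the
hang. As a rough sketch (assuming the members are still /dev/sd[cdef]1
as in your notes; adjust the names if they have moved), something like
this pulls the reshape position, device role, and event count out of
each member's superblock:

sudo mdadm --examine /dev/sd[cdef]1 | grep -E '/dev/|Reshape|Delta|Device Role|Events'

If one member shows a different reshape position or an older event
count than the others, that tells us which superblock is stale and
which data can still be trusted.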

Right now the array reports it is a 5-disk array, and the array
metadata says it was going from 4 disks to 5.

What was the command you used to add the 4th disk? Based on what you
have described, no one is sure exactly how the array got into this
state: the metadata being shown disagrees with what you are reporting,
and without knowing what you ran, no one knows what actually happened.
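
For comparison, the usual way to grow from 3 to 4 devices is a two-step
add-then-grow, roughly like the following (only a sketch; /dev/sde1 and
the backup-file path are placeholders, sde1 assumed here because it is
the member the array no longer recognizes):

sudo mdadm --add /dev/md480 /dev/sde1
sudo mdadm --grow /dev/md480 --raid-devices=4 --backup-file=/root/md480-grow.backup

If what you actually ran differed from that, for example a different
--raid-devices count or a grow issued twice, that would go a long way
towards explaining why the metadata now shows a 4->5 reshape.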

On Mon, May 22, 2023 at 3:18 PM raid <raid@electrons.cloud> wrote:
>
> Hi
>
> Thanks for your time so far! Final questions before I rebuild this RAID from scratch.
>
> BTW I kept detailed notes when I created this array (as I have for the eight other RAIDs that I maintain).
>     These notes may be applicable later... Here's why.
>
> Do you think that zeroing the drives (as is done for initial drive prep) and then recreating the
> RAID5 using the initial settings (originally three drives, NOW four drives) could possibly offer
> a greater chance to recover files? As in, more complete file recovery if the striping aligns
> correctly? Technically, I've had to write off the files that aren't currently backed up.
>
> However, I'm still willing to make an attempt if you think the idea above might yield something
> better than one or two stripes of data per file?
>
> And/Or any other tips for this final attempt? Setting ReadOnly if possible?
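>
> (By ReadOnly I mean something along the lines of marking the array
> read-only as soon as it comes up and mounting read-only, so this final
> attempt cannot rewrite anything, e.g.:
>
> sudo mdadm --readonly /dev/md480
> sudo mount -o ro /dev/md480 /MEGARAID
>
> Please correct me if that is not what you would suggest.)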
>
> Thanks Again
> SA
>
> ---
> Detailed Notes:
> ============================================================
>   2021.10.26 0200P NEW RAID MD480 (48TB) 3x 16TB HITACHI
> ========================================================================================================================
> = PREPARATION ==
>
> watch -c -d -n 1 cat /proc/mdstat  ############## OPEN A TERMINAL AND MONITOR STATUS ##
>
> sudo lsblk && sudo blkid  ########################################### VERIFY DEVICES ##
>
> sudo umount /MEGARAID                         # Unmount if filesystem is mounted
> sudo mdadm --stop /dev/md480                  # Stop the RAID/md480 device
> sudo mdadm --zero-superblock /dev/sd[cdf]1    # Zero  the   superblock(s)  on
>                                               #      all members of the array
> sudo mdadm --remove /dev/md480                # Remove the RAID/md480
>
> Edit  ########################################## OPTIONAL FINALIZE PERMANENT REMOVAL ##
> /etc/fstab
> /etc/mdadm/mdadm.conf
> Removing references to the mount point and the definition of the RAID/MD480 device(s)
> NOTE: Some fstab CFG settings allow skipping devices when unavailable at boot. (nofail)
>
> sudo update-initramfs -uv       # -uv  update ; verbose  ########### RESET INITRAMFS ##
>
> ======================================================================================== CREATE RAID & ADD FILESYSTEM ==
>   MEGARAID 2021.10.26 0200P
> ##############  RAID5 ARRAY MD480 32TB (32,001,527,644,160 bytes) Available (3x16TB) ##
>
> sudo mdadm --create --verbose /dev/md480 --level=5 --raid-devices=3 --uuid=2021102502005a7a5a7abeefcafebabe /dev/sd[cdf]1
>
> 31,251,491,840 BLOCKS CREATED IN ~20 HOURS
>
> ############################################################  CREATE FILESYSTEM EXT4 ##
>  -v VERBOSE
>  -L DISK LABEL
>  -U UUID FORMATTED AS 8CHARS-4CHARS-4CHARS-4CHARS-12CHARS
>  -m OVERFLOW PROTECTION PERCENTAGE IE. .025 OF 24,576GB IS ~615MB FREE IS CONSIDERED FULL
>  -b BLOCK SIZE 1/4 OF STRIDE= OFFERS BEST OVERALL PERFORMANCE
>  -E STRIDE= MULTIPLE OF 8
>     STRIPE-WIDTH= STRIDE X 2
>
> sudo mkfs.ext4 -v -L MEGARAID -U 20211028-0500-5a7a-5a7a-beefcafebabe -m .025 -b 4096 -E stride=32,stripe-width=64 /dev/md480
>
> sudo mkdir  /MEGARAID  ; sudo chown adminx:adminx -R /MEGARAID
>
> ##############################################################  SET CORRECT HOMEHOST ##
>
> sudo umount /MEGARAID
> sudo mdadm --stop /dev/md480
> sudo mdadm --assemble --update=homehost --homehost=GRANDSLAM /dev/md480 /dev/sd[cdf]1
> sudo blkid
>
> /dev/sdc1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
>            UUID_SUB="8f0835db-3ea2-4540-2ab4-232d6203d1b7"
>            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
>            PARTLABEL="HIT*16TB*001*RAID5"
>            PARTUUID="3b68fe63-35d0-404d-912e-dfe1127f109b"
>
> /dev/sdd1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
>            UUID_SUB="b4660f49-867b-9f1e-ecad-0acec7119c37"
>            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
>            PARTLABEL="HIT*16TB*002*RAID5"
>            PARTUUID="32c50f4f-f6ce-4309-b8e4-facdb6e05ba8"
>
> /dev/sdf1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
>            UUID_SUB="79a3dff4-c53f-9071-f9c1-c262403fbc10"
>            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
>            PARTLABEL="HIT*16TB*003*RAID5"
>            PARTUUID="7ec27f96-2275-4e09-9013-ac056f11ebfb"
>
> /dev/md480: LABEL="MEGARAID" UUID="20211028-0500-5a7a-5a7a-beefcafebabe" TYPE="ext4"
>
> ############################################################### ENTRY FOR /ETC/FSTAB ##
>
> /dev/md480              /MEGARAID               ext4            nofail,noatime,nodiratime,relatime,errors=remount-ro  0  2
>
> #################################################### ENTRY FOR /ETC/MDADM/MDADM.CONF ##
>
> ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480 UUID=20211025:02005a7a:5a7abeef:cafebabe
>
> #######################################################################################
>
> sudo update-initramfs -uv       # -uv  update ; verbose
> sudo mount -a
> sudo chown adminx:adminx -R /MEGARAID
>
> ############################################################### END 2021.10.28 0545A ##
>
>
>
>
>
>
> On Mon, 2023-05-22 at 15:51 +0800, Yu Kuai wrote:
> > Hi,
> >
> > 在 2023/05/22 14:56, raid 写道:
> > > Hi,
> > > Thanks for the guidance as the current state has at least changed somewhat.
> > >
> > > BTW Sorry about Life getting in the way of tech. =) Reason for my delayed response.
> > >
> > > -sudo mdadm -I /dev/sdc1
> > > mdadm: /dev/sdc1 attached to /dev/md480, not enough to start (1).
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 1
> > >         Persistence : Superblock is persistent
> > >
> > >               State : inactive
> > >     Working Devices : 1
> > >
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > >
> > >      Number   Major   Minor   RaidDevice
> > >
> > >         -       8       33        -        /dev/sdc1
> > > -sudo mdadm -I /dev/sdd1
> > > mdadm: /dev/sdd1 attached to /dev/md480, not enough to start (2).
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 2
> > >         Persistence : Superblock is persistent
> > >
> > >               State : inactive
> > >     Working Devices : 2
> > >
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > >
> > >      Number   Major   Minor   RaidDevice
> > >
> > >         -       8       49        -        /dev/sdd1
> > >         -       8       33        -        /dev/sdc1
> > > -sudo mdadm -I /dev/sde1
> > > mdadm: /dev/sde1 attached to /dev/md480, not enough to start (2).
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 3
> > >         Persistence : Superblock is persistent
> > >
> > >               State : inactive
> > >     Working Devices : 3
> > >
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78712
> > >
> > >      Number   Major   Minor   RaidDevice
> > >
> > >         -       8       65        -        /dev/sde1
> > >         -       8       49        -        /dev/sdd1
> > >         -       8       33        -        /dev/sdc1
> > > -sudo mdadm -I /dev/sdf1
> > > mdadm: /dev/sdf1 attached to /dev/md480, not enough to start (3).
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 4
> > >         Persistence : Superblock is persistent
> > >
> > >               State : inactive
> > >     Working Devices : 4
> > >
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > >
> > >      Number   Major   Minor   RaidDevice
> > >
> > >         -       8       81        -        /dev/sdf1
> > >         -       8       65        -        /dev/sde1
> > >         -       8       49        -        /dev/sdd1
> > >         -       8       33        -        /dev/sdc1
> > > -sudo mdadm -R /dev/md480
> > > mdadm: failed to start array /dev/md480: Input/output error
> > > ---
> > > NOTE: Of additional interest...
> > > ---
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >       Creation Time : Tue Oct 26 14:06:53 2021
> > >          Raid Level : raid5
> > >       Used Dev Size : 18446744073709551615
> > >        Raid Devices : 5
> > >       Total Devices : 3
> > >         Persistence : Superblock is persistent
> > >
> > >         Update Time : Thu May  4 14:39:03 2023
> > >               State : active, FAILED, Not Started
> > >      Active Devices : 3
> > >     Working Devices : 3
> > >      Failed Devices : 0
> > >       Spare Devices : 0
> > >
> > >              Layout : left-symmetric
> > >          Chunk Size : 512K
> > >
> > > Consistency Policy : unknown
> > >
> > >       Delta Devices : 1, (4->5)
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > >
> > >      Number   Major   Minor   RaidDevice State
> > >         -       0        0        0      removed
> > >         -       0        0        1      removed
> > >         -       0        0        2      removed
> > >         -       0        0        3      removed
> > >         -       0        0        4      removed
> > >
> > >         -       8       81        3      sync   /dev/sdf1
> > >         -       8       49        1      sync   /dev/sdd1
> > >         -       8       33        0      sync   /dev/sdc1
> >
> > So the reason that this array can't start is that /dev/sde1 is not
> > recognized as RaidDevice 2, and there are two RaidDevices missing for
> > a raid5.
> >
> > Sadly I have no idea how to work around this; the superblock metadata seems to be broken.
> >
> > Thanks,
> > Kuai
> > > ---
> > > -watch -c -d -n 1 cat /proc/mdstat
> > > ---
> > > Every 1.0s: cat /proc/mdstat                                                     OAK2023: Mon May 22 01:48:24 2023
> > >
> > > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> > > md480 : inactive sdf1[4] sdd1[1] sdc1[0]
> > >        46877239294 blocks super 1.2
> > >
> > > unused devices: <none>
> > > ---
> > > Hopefully that is some progress towards an array start? It's definitely unexpected output to me.
> > > I/O Error starting md480
> > >
> > > Thanks!
> > > SA
> > >
> > > On Thu, 2023-05-18 at 11:15 +0800, Yu Kuai wrote:
> > >
> > > > I have no idea why the other disks show that device 2 is missing, or what
> > > > device 4 is.
> > > >
> > > > Anyway, can you try the following?
> > > >
> > > > mdadm -I /dev/sdc1
> > > > mdadm -D /dev/mdxxx
> > > >
> > > > mdadm -I /dev/sdd1
> > > > mdadm -D /dev/mdxxx
> > > >
> > > > mdadm -I /dev/sde1
> > > > mdadm -D /dev/mdxxx
> > > >
> > > > mdadm -I /dev/sdf1
> > > > mdadm -D /dev/mdxxx
> > > >
> > > > If the above works well, you can try:
> > > >
> > > > mdadm -R /dev/mdxxx, and see if the array can be started.
> > > >
> > > > Thanks,
> > > > Kuai
> > >
> > >
> > >
> > > .
> > >
>



