Re: RAID6 recovery, event count mismatch

>>>>> "Linus" == Linus Lüssing <linus.luessing@xxxxxxxxx> writes:

> I recently had a disk failure in my Linux RAID6 array with 6
> devices. The badblocks tool confirmed some broken sectors. I tried
> to remove the faulty drive, but that seems to have caused more
> issues (2.5" "low power" drives connected via USB-SATA adapters
> over an externally powered USB hub to a Pi 4... a setup which had
> run fine for more than a year, but seems to be prone to physical
> disk reconnects).

Yikes!  I'm amazed you haven't had more problems with this setup.  It
must be pretty darn slow...

> I removed the faulty drive, rebooted the whole system, and the RAID
> is now inactive. The event count is a little old on 3 of the 5
> disks (off by 3 to 7 events).

That implies to me that when you removed the faulty drive, the array
(USB bus, etc) went south at the same time.   


> Question 1)

> Is it safe and still recommended to use the command that is
> suggested here?

> https://raid.wiki.kernel.org/index.php/RAID_Recovery#Trying_to_assemble_using_--force
-> "mdadm /dev/mdX --assemble --force <list of devices>"

I would try this first. Do you have any details of the individual
drives and their event counts as well?
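
For reference, a quick way to compare the event counts across the
members (using the dm-9 through dm-13 names from your output below;
adjust to whatever your members are actually called) would be
something like:

  # print the event count recorded in each member's superblock
  for d in /dev/dm-9 /dev/dm-10 /dev/dm-11 /dev/dm-12 /dev/dm-13; do
      echo -n "$d: "
      mdadm --examine "$d" | grep Events
  done

and a forced assembly along the lines of that wiki page would then be
roughly:

  # stop the inactive array first, then reassemble with --force
  mdadm --stop /dev/md127
  mdadm --assemble --force /dev/md127 \
      /dev/dm-9 /dev/dm-10 /dev/dm-11 /dev/dm-12 /dev/dm-13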

> Or should I do a forced --re-add of the three devices that have
> the lower event counts and a "Device Role : spare"?
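
(For comparison, a re-add is issued per device in manage mode against
an already assembled array, so it would look roughly like the
following, again with a dm-* name taken from your output below; but
as said above, I'd try --assemble --force first.)

  # re-add a single former member to a running array
  mdadm /dev/md127 --re-add /dev/dm-9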

> Question 2)

> If a forced re-add/assemble works and a RAID check / rebuild runs
> through fine, is everything fine again then? Or are there additional
> steps I should follow to ensure the data and filesystems are
> not corrupted? (Below, I'm using LVM with normal and thinly
> provisioned volumes with LXD for containers, and other than that
> the volumes are formatted with ext4.)
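
Not an authoritative checklist, but once the array is active and has
resynced, the usual follow-ups are an md consistency check plus
read-only filesystem checks, along these lines (the VG/LV names are
placeholders for your actual volumes):

  # request a full redundancy check and watch the mismatch counter
  echo check > /sys/block/md127/md/sync_action
  cat /proc/mdstat
  cat /sys/block/md127/md/mismatch_cnt

  # read-only check of one of the ext4 logical volumes
  e2fsck -fn /dev/VG/LV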

> Question 3)

> Would the "new" Linux RAID write journal feature with a dedicated
> SSD have prevented such an inconsistency?
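
(For context, the write journal is normally specified when the array
is created, with a dedicated journal device; a rough sketch only,
with placeholder device names:)

  # example only: journal device given at array creation time
  mdadm --create /dev/md0 --level=6 --raid-devices=6 \
      --write-journal /dev/sdX1 /dev/sd[a-f]1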

> Question 4)

> "mdadm -D /dev/md127" says "Raid Level : raid0", which is wrong
> and luckily the disks themselves each individually still know
> it's a raid6 according to mdadm. Is this just a displaying bug
> of mdadm and nothing to worry about?

> System/OS:

> $ uname -a
> Linux treehouse 5.18.9-v8+ #4 SMP PREEMPT Mon Jul 11 02:47:28 CEST 2022 aarch64 GNU/Linux
> $ mdadm --version
> mdadm - v4.1 - 2018-10-01
> $ cat /etc/debian_version
> 11.5
-> Debian bullseye


> More detailed mdadm output below.

> Regards, Linus


> ==========

> $ cat /proc/mdstat 
> Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
> md127 : inactive dm-13[6](S) dm-12[5](S) dm-11[3](S) dm-10[2](S) dm-9[0](S)
>       9762371240 blocks super 1.2
       
> unused devices: <none>

> $ mdadm -D /dev/md127
> /dev/md127:
>            Version : 1.2
>         Raid Level : raid0
>      Total Devices : 5
>        Persistence : Superblock is persistent

>              State : inactive
>    Working Devices : 5

>               Name : treehouse:raid  (local to host treehouse)
>               UUID : cc2852b8:aca4bdf8:761739d6:0ca5c3bb
>             Events : 2554495

>     Number   Major   Minor   RaidDevice

>        -     254       13        -        /dev/dm-13
>        -     254       11        -        /dev/dm-11
>        -     254       12        -        /dev/dm-12
>        -     254       10        -        /dev/dm-10
>        -     254        9        -        /dev/dm-9

> $ mdadm -E /dev/dm-9
> /dev/dm-9:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : cc2852b8:aca4bdf8:761739d6:0ca5c3bb
>            Name : treehouse:raid  (local to host treehouse)
>   Creation Time : Mon Jan 29 02:48:26 2018
>      Raid Level : raid6
>    Raid Devices : 6

>  Avail Dev Size : 3904948496 (1862.02 GiB 1999.33 GB)
>      Array Size : 7809878016 (7448.08 GiB 7997.32 GB)
>   Used Dev Size : 3904939008 (1862.02 GiB 1999.33 GB)
>     Data Offset : 252928 sectors
>    Super Offset : 8 sectors
>    Unused Space : before=252840 sectors, after=9488 sectors
>           State : clean
>     Device UUID : 5fa00c38:e4069502:d4013eeb:08801a9b

> Internal Bitmap : 8 sectors from superblock
>     Update Time : Sat Nov 26 09:59:17 2022
>   Bad Block Log : 512 entries available at offset 72 sectors
>        Checksum : 1a214e3c - correct
>          Events : 2554492

>          Layout : left-symmetric
>      Chunk Size : 512K

>    Device Role : spare
>    Array State : ...A.A ('A' == active, '.' == missing, 'R' == replacing)
> $ mdadm -E /dev/dm-10
> /dev/dm-10:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : cc2852b8:aca4bdf8:761739d6:0ca5c3bb
>            Name : treehouse:raid  (local to host treehouse)
>   Creation Time : Mon Jan 29 02:48:26 2018
>      Raid Level : raid6
>    Raid Devices : 6

>  Avail Dev Size : 3904948496 (1862.02 GiB 1999.33 GB)
>      Array Size : 7809878016 (7448.08 GiB 7997.32 GB)
>   Used Dev Size : 3904939008 (1862.02 GiB 1999.33 GB)
>     Data Offset : 252928 sectors
>    Super Offset : 8 sectors
>    Unused Space : before=252840 sectors, after=9488 sectors
>           State : clean
>     Device UUID : 7edd1414:e610975a:fbe4a253:7ff9d404

> Internal Bitmap : 8 sectors from superblock
>     Update Time : Sat Nov 26 09:35:16 2022
>   Bad Block Log : 512 entries available at offset 72 sectors
>        Checksum : 204aec57 - correct
>          Events : 2554488

>          Layout : left-symmetric
>      Chunk Size : 512K

>    Device Role : spare
>    Array State : ...A.A ('A' == active, '.' == missing, 'R' == replacing)
> $ mdadm -E /dev/dm-11
> /dev/dm-11:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : cc2852b8:aca4bdf8:761739d6:0ca5c3bb
>            Name : treehouse:raid  (local to host treehouse)
>   Creation Time : Mon Jan 29 02:48:26 2018
>      Raid Level : raid6
>    Raid Devices : 6

>  Avail Dev Size : 3904948496 (1862.02 GiB 1999.33 GB)
>      Array Size : 7809878016 (7448.08 GiB 7997.32 GB)
>   Used Dev Size : 3904939008 (1862.02 GiB 1999.33 GB)
>     Data Offset : 252928 sectors
>    Super Offset : 8 sectors
>    Unused Space : before=252840 sectors, after=9488 sectors
>           State : clean
>     Device UUID : e8620025:d7cfec3d:a580f07d:9b7b5e11

> Internal Bitmap : 8 sectors from superblock
>     Update Time : Sat Nov 26 09:47:17 2022
>   Bad Block Log : 512 entries available at offset 72 sectors
>        Checksum : 4b64514b - correct
>          Events : 2554490

>          Layout : left-symmetric
>      Chunk Size : 512K

>    Device Role : spare
>    Array State : ...A.A ('A' == active, '.' == missing, 'R' == replacing)
> $ mdadm -E /dev/dm-12
> /dev/dm-12:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : cc2852b8:aca4bdf8:761739d6:0ca5c3bb
>            Name : treehouse:raid  (local to host treehouse)
>   Creation Time : Mon Jan 29 02:48:26 2018
>      Raid Level : raid6
>    Raid Devices : 6

>  Avail Dev Size : 3904948496 (1862.02 GiB 1999.33 GB)
>      Array Size : 7809878016 (7448.08 GiB 7997.32 GB)
>   Used Dev Size : 3904939008 (1862.02 GiB 1999.33 GB)
>     Data Offset : 252928 sectors
>    Super Offset : 8 sectors
>    Unused Space : before=252840 sectors, after=9488 sectors
>           State : clean
>     Device UUID : 02cd8021:ece5f701:777c1d5e:1f19449a

> Internal Bitmap : 8 sectors from superblock
>     Update Time : Sun Dec  4 00:57:01 2022
>   Bad Block Log : 512 entries available at offset 72 sectors
>        Checksum : 750b7a8f - correct
>          Events : 2554495

>          Layout : left-symmetric
>      Chunk Size : 512K

>    Device Role : Active device 3
>    Array State : ...A.A ('A' == active, '.' == missing, 'R' == replacing)
> $ mdadm -E /dev/dm-13
> /dev/dm-13:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : cc2852b8:aca4bdf8:761739d6:0ca5c3bb
>            Name : treehouse:raid  (local to host treehouse)
>   Creation Time : Mon Jan 29 02:48:26 2018
>      Raid Level : raid6
>    Raid Devices : 6

>  Avail Dev Size : 3904948496 (1862.02 GiB 1999.33 GB)
>      Array Size : 7809878016 (7448.08 GiB 7997.32 GB)
>   Used Dev Size : 3904939008 (1862.02 GiB 1999.33 GB)
>     Data Offset : 252928 sectors
>    Super Offset : 8 sectors
>    Unused Space : before=252848 sectors, after=9488 sectors
>           State : clean
>     Device UUID : c7e94388:5d5020e9:51fe2079:9f6a989d

> Internal Bitmap : 8 sectors from superblock
>     Update Time : Sun Dec  4 00:57:01 2022
>   Bad Block Log : 512 entries available at offset 16 sectors
>        Checksum : e14ed4e9 - correct
>          Events : 2554495

>          Layout : left-symmetric
>      Chunk Size : 512K

>    Device Role : Active device 5
>    Array State : ...A.A ('A' == active, '.' == missing, 'R' == replacing)


