Cleaning up a Raid5 after discrepancies discovered



I noticed some data inconsistencies in my raid5 (5 disks, 3.6T per
disk) and discovered via smartmon that 1 disk was about to fail (many
reallocated sectors). Mismatch_cnt was approximately 128 at this point.
I don't have a spare 6th disk in the setup.

I dd'd the failing disk's entire contents (including partition table)
to a new (8T) disk and inserted it in the array. The new configuration
was recognized without problems. I ran check without mounting the file
system. This completed (I failed to check dmesg to see how many
inconsistencies it found). I mounted the file system and things seemed

Next I did a diff against a backup (unfortunately a close but
not perfect backup). There were definitely some differences within
some binary files.

So I ran check (again) at this point and saw some mismatched sectors
in the dmesg log, and the Mismatch_cnt is 128.
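
For reference, a minimal sketch of how a check like this can be started
and read back through the md sysfs files (sync_action and mismatch_cnt).
The function names and the SYSFS_ROOT override are hypothetical, added
only so the sketch can be tried against a scratch directory; writing to
the real sync_action file needs root.

```shell
# Sketch only: helper names and SYSFS_ROOT are illustrative assumptions.
# SYSFS_ROOT defaults to the real sysfs; point it elsewhere for a dry run.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

md_start_check() {
    # Writing "check" to sync_action requests a read-only scrub of the array.
    echo check > "$SYSFS_ROOT/block/$1/md/sync_action"
}

md_sync_action() {
    # Current action: "check" while the scrub runs, "idle" once it is done.
    cat "$SYSFS_ROOT/block/$1/md/sync_action"
}

md_mismatch_cnt() {
    # Mismatch counter as left by the most recent check.
    cat "$SYSFS_ROOT/block/$1/md/mismatch_cnt"
}
```

Typical use would be "md_start_check md127", then polling
"md_sync_action md127" until it reports idle, then "md_mismatch_cnt md127".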

My question is "how to clean up this array?"

Should I try to delete the specific files I know have discrepancies
and recopy them from the backup? Does that cure the mismatches in the
space occupied by those files?
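
To judge whether a mismatch lies under a particular file, the sector
ranges from dmesg first have to be turned into array addresses. The
sketch below does only that arithmetic. It assumes (I have not verified
this) that the logged sector is an offset into each member's data area
rather than an array address, and it takes the 512K chunk (1024 sectors)
and 4 data disks from the --examine output below. The function name is
hypothetical.

```shell
# Sketch only. Assumption: the sector in "mismatch sector in range A-B" is a
# per-member offset, so one logged sector corresponds to one candidate array
# sector per data chunk in that stripe (the parity chunk holds no file data).
array_sectors_for() {
    dev_sector=$1
    chunk_sectors=${2:-1024}   # 512K chunk = 1024 sectors of 512 bytes
    data_disks=${3:-4}         # 5-disk RAID5 = 4 data chunks per stripe
    stripe=$(( dev_sector / chunk_sectors ))
    offset=$(( dev_sector % chunk_sectors ))
    d=0
    while [ "$d" -lt "$data_disks" ]; do
        # Logical chunk number in the array is stripe * data_disks + d.
        echo $(( (stripe * data_disks + d) * chunk_sectors + offset ))
        d=$(( d + 1 ))
    done
}
```

Calling "array_sectors_for 3574914288" prints four candidate array
sectors. A check cannot tell which chunk of the stripe (data or parity)
is the bad one, so any of the four locations, or none of them, may hold
the damaged data.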

I have seen a post where a user filled the disk with a file of zeros
and then deleted that file to deal with mismatches in free space.

What strategy should one take when it's clear that there's been a
limited amount of bitrot?


PS Detailed information is attached below


$ uname -a
Linux xxxxxxx 6.8.11-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Sun May 26
20:05:41 UTC 2024 x86_64 GNU/Linux

$ mdadm --version
mdadm - v4.2 - 2021-12-30

$ more /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md127 : active raid5 sdi1[3] sdk1[1] sdl1[0] sdj1[2] sdh1[5]
      15627542528 blocks super 1.2 level 5, 512k chunk, algorithm 2
[5/5] [UUUUU
      bitmap: 0/30 pages [0KB], 65536KB chunk

unused devices: <none>

$ more /sys/block/md127/md/mismatch_cnt 

checking operation:

$ more dmesg

[518371.195611] md/raid:md127: device sdi1 operational as raid disk 3
[518371.195621] md/raid:md127: device sdk1 operational as raid disk 1
[518371.195625] md/raid:md127: device sdl1 operational as raid disk 0
[518371.195627] md/raid:md127: device sdj1 operational as raid disk 2
[518371.195630] md/raid:md127: device sdh1 operational as raid disk 4
[518371.197612] md/raid:md127: raid level 5 active with 5 out of 5
devices, algorithm 2
[518371.233040] md127: detected capacity change from 0 to 31255085056
[518615.655340] SGI XFS with ACLs, security attributes, realtime,
scrub, quota, no debug enabled
[518615.661545] XFS (md127): Deprecated V4 format (crc=0) will not be
supported after September 2030.
[518615.661970] XFS (md127): Mounting V4 Filesystem 134d3d10-3a73-462d-
[518616.108155] XFS (md127): Starting recovery (logdev: internal)
[518616.182117] XFS (md127): Ending recovery (logdev: internal)
[518616.182357] XFS (md127): Unmounting Filesystem 134d3d10-3a73-462d-
[518633.338736] XFS (md127): Mounting V4 Filesystem 134d3d10-3a73-462d-
[518633.740966] XFS (md127): Ending clean mount
[525118.638537] md: data-check of RAID array md127
[560647.736826] perf: interrupt took too long (6462 > 6453), lowering
kernel.perf_event_max_sample_rate to 30000
[757745.588678] md127: mismatch sector in range 3574914288-3574914296
[757745.588690] md127: mismatch sector in range 3574914296-3574914304
[757748.955261] md127: mismatch sector in range 3575062536-3575062544
[757827.106584] md127: mismatch sector in range 3576178688-3576178696
[779366.372926] md127: mismatch sector in range 3907250080-3907250088
[779383.573705] md127: mismatch sector in range 3907600576-3907600584
[820930.145928] md127: mismatch sector in range 4559852464-4559852472
[820930.145940] md127: mismatch sector in range 4559852472-4559852480
[820930.145943] md127: mismatch sector in range 4559852480-4559852488
[820930.145946] md127: mismatch sector in range 4559852488-4559852496
[820930.145948] md127: mismatch sector in range 4559852496-4559852504
[820930.145953] md127: mismatch sector in range 4559852504-4559852512
[820930.145955] md127: mismatch sector in range 4559852512-4559852520
[820930.145958] md127: mismatch sector in range 4559852520-4559852528
[820930.145960] md127: mismatch sector in range 4559852528-4559852536
[820930.145963] md127: mismatch sector in range 4559852536-4559852544
[1024770.015887] md: md127: data-check done.
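
As a cross-check, the logged ranges can be tallied with a short awk
filter. This is a sketch, and the function name is hypothetical; it
counts the "mismatch sector in range" lines on stdin and sums the
sectors they cover.

```shell
# Sketch only: tally "mismatch sector in range START-END" lines from stdin.
tally_mismatches() {
    awk '/mismatch sector in range/ {
             split($NF, r, "-")        # last field is "START-END"
             ranges++
             sectors += r[2] - r[1]    # sectors covered by this range
         }
         END { printf "%d ranges, %d sectors\n", ranges, sectors }'
}
```

Run as "dmesg | tally_mismatches". The 16 ranges listed above cover 8
sectors each, 128 sectors in total, which is consistent with the
Mismatch_cnt of 128.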

$ sudo mdadm --examine /dev/sdi
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
$ sudo mdadm --examine /dev/sdj
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
$ sudo mdadm --examine /dev/sdk
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
$ sudo mdadm --examine /dev/sdl
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
$ sudo mdadm --examine /dev/sdh
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
$ sudo mdadm --examine /dev/sdi1
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 954b2546:5c467e9c:a4eb74e3:27dad837
           Name : impala:0
  Creation Time : Fri May 22 15:32:31 2015
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 7813771264 sectors (3.64 TiB 4.00 TB)
     Array Size : 15627542528 KiB (14.55 TiB 16.00 TB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 2e1a57ff:f892fb23:1f698390:53dd98e3

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Jun 13 06:44:05 2024
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 2f92510e - correct
         Events : 209201

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAA ('A' == active, '.' == missing, 'R' ==
$ sudo mdadm --examine /dev/sdj1
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 954b2546:5c467e9c:a4eb74e3:27dad837
           Name : impala:0
  Creation Time : Fri May 22 15:32:31 2015
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 7813771264 sectors (3.64 TiB 4.00 TB)
     Array Size : 15627542528 KiB (14.55 TiB 16.00 TB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 3be7bbb4:4e5f07e3:f78f3c31:5bd6df6b

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Jun 13 06:44:05 2024
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : cd030a4f - correct
         Events : 209201

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAA ('A' == active, '.' == missing, 'R' ==
$ sudo mdadm --examine /dev/sdk1
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 954b2546:5c467e9c:a4eb74e3:27dad837
           Name : impala:0
  Creation Time : Fri May 22 15:32:31 2015
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 7813771264 sectors (3.64 TiB 4.00 TB)
     Array Size : 15627542528 KiB (14.55 TiB 16.00 TB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 2b09eed0:0a6ead54:48671d28:0abd1b6e

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Jun 13 06:44:05 2024
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 53f7fcb2 - correct
         Events : 209201

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAA ('A' == active, '.' == missing, 'R' ==
$ sudo mdadm --examine /dev/sdl1
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 954b2546:5c467e9c:a4eb74e3:27dad837
           Name : impala:0
  Creation Time : Fri May 22 15:32:31 2015
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 7813771264 sectors (3.64 TiB 4.00 TB)
     Array Size : 15627542528 KiB (14.55 TiB 16.00 TB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 324b49de:233d8769:7f75afad:dddb0ec8

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Jun 13 06:44:05 2024
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 55b23724 - correct
         Events : 209201

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAA ('A' == active, '.' == missing, 'R' ==
$ sudo mdadm --examine /dev/sdh1
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 954b2546:5c467e9c:a4eb74e3:27dad837
           Name : impala:0
  Creation Time : Fri May 22 15:32:31 2015
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 7813771264 sectors (3.64 TiB 4.00 TB)
     Array Size : 15627542528 KiB (14.55 TiB 16.00 TB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : f0cda836:8c1c28d1:53710d20:db8d088a

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Jun 13 06:44:05 2024
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 4a0a4721 - correct
         Events : 209201

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAA ('A' == active, '.' == missing, 'R' ==

$ lsblk
NAME                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                               8:0    0 476.9G  0 disk  
├─sda1                            8:1    0    10G  0 part  /boot
└─sda2                            8:2    0 466.9G  0 part  
  └─fedora_localhost--live-root 253:0    0 466.9G  0 lvm   /
sdb                               8:16   0   7.3T  0 disk  
└─sdb1                            8:17   0   7.3T  0 part  
sdc                               8:32   0   1.8T  0 disk  
└─sdc1                            8:33   0   1.8T  0 part  
sdd                               8:48   0   7.3T  0 disk  
└─sdd1                            8:49   0   7.3T  0 part  
sde                               8:64   0   1.8T  0 disk  
└─sde1                            8:65   0   1.8T  0 part  
sdf                               8:80   1     0B  0 disk  
sdg                               8:96   1     0B  0 disk  
sdh                               8:112  0   3.6T  0 disk  
└─sdh1                            8:113  0   3.6T  0 part  
  └─md127                         9:127  0  14.6T  0 raid5 /mnt/backup
sdi                               8:128  0   3.6T  0 disk  
└─sdi1                            8:129  0   3.6T  0 part  
  └─md127                         9:127  0  14.6T  0 raid5 /mnt/backup
sdj                               8:144  0   3.6T  0 disk  
└─sdj1                            8:145  0   3.6T  0 part  
  └─md127                         9:127  0  14.6T  0 raid5 /mnt/backup
sdk                               8:160  0   7.3T  0 disk  
└─sdk1                            8:161  0   3.6T  0 part  
  └─md127                         9:127  0  14.6T  0 raid5 /mnt/backup
sdl                               8:176  0   3.6T  0 disk  
└─sdl1                            8:177  0   3.6T  0 part  
  └─md127                         9:127  0  14.6T  0 raid5 /mnt/backup
sr0                              11:0    1  1024M  0 rom   
sr1                              11:1    1  1024M  0 rom   
zram0                           252:0    0     8G  0 disk  [SWAP]
