Re: Help ironing out persistent mismatches on raid6

On 03/12/2021 09.04, Matt Garretson wrote:
Hi, I have this RAID6 array of 6x 8TB drives:

/dev/md1:
            Version : 1.2
      Creation Time : Fri Jul  6 23:20:38 2018
         Raid Level : raid6
         Array Size : 31255166976 (29.11 TiB 32.01 TB)
      Used Dev Size : 7813791744 (7.28 TiB 8.00 TB)
       Raid Devices : 6

There is an ext4 fs on the device (no lvm).

For over a year, the array has had 40 contiguous mismatched sectors in the same spot:

md1: mismatch sector in range 2742891144-2742891152
md1: mismatch sector in range 2742891152-2742891160
md1: mismatch sector in range 2742891160-2742891168
md1: mismatch sector in range 2742891168-2742891176
md1: mismatch sector in range 2742891176-2742891184

Sector size is 512, so I guess this works out to be five 4KiB blocks, or
20KiB of space.

The array is checked weekly, but has never been "repaired".  The ext4
filesystem has been fsck'd a lot over the years, with no problems.  But
I worry about what file might potentially have bad data in it.  There
are a lot of files.

I have done:

dd status=none if=/dev/md1 ibs=512 skip=2742891144 count=40  |hexdump -C

... and I don't see anything meaningful to me.

I have done  dumpe2fs -h /dev/md1 and it tells me block size is 4096 and
the first block is 0.  So does....

2742891144 * 512 / 4096 = 342861393

...mean we are dealing with blocks # 342861393 - 342861398 of the
filesystem?  If so, is there a way for me to see what file(s) use those
blocks?

Thanks in advance for any tips...
-Matt

Yes, your arithmetic is right: 40 sectors of 512 bytes are five 4096-byte blocks, numbers 342861393
through 342861397 (342861398 exclusive, matching the kernel's half-open ranges). I use debugfs to find
the owning files. Given each fs block range (lo, hi) calculated that way from the raid mismatch notice:

I first check which blocks in each reported range are actually in use with
	debugfs -R "testb $lo $((hi-lo))" $device
then map the in-use blocks to their owning inodes with
	debugfs -R "icheck $list" $device
and finally map each inode to its pathname(s) with
	debugfs -R "ncheck $inode" $device

Some of the above debugfs requests can take a very long time to run. I have a script that does all of
this and can be left running for a day (or longer), but it is very specific to my setup.
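
Roughly, the shape of it is something like this; a bare outline only, not my actual script, with your
device, block size, and the combined sector range from your five log lines filled in as placeholders:

	#!/bin/bash
	# Outline only: turn an md mismatch sector range into an ext4 block range,
	# then feed it to the debugfs steps above.
	device=/dev/md1
	sec_lo=2742891144        # start of the first reported range
	sec_hi=2742891184        # end of the last reported range (exclusive)
	blocksize=4096           # fs block size, as reported by dumpe2fs -h

	lo=$(( sec_lo * 512 / blocksize ))               # first affected fs block
	cnt=$(( (sec_hi - sec_lo) * 512 / blocksize ))   # number of affected fs blocks

	debugfs -R "testb $lo $cnt" "$device"
	# ...then icheck the blocks that testb reports in use, and ncheck the
	# inode numbers that icheck prints, as in the three steps above.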

HTH

--
Eyal Lebedinsky (eyal@xxxxxxxxxxxxxx)


