Re: Cleaning up a Raid5 after discrepancies discovered

Roman Mamedov <rm@xxxxxxxxxxx> · Fri, 14 Jun 2024 00:55:10 +0500

On Thu, 13 Jun 2024 12:36:50 -0400
dfc <chernoff@xxxxxxxxxxxxxxxxx> wrote:

> I noticed some data inconsistencies in my raid5 (5 disks, 3.6T per
> disk) and discovered via smartmon that 1 disk was about to fail (many
> reallocated sectors). Mismatch_cnt was approximately 128 at this point.
> I don't have a spare 6th disk in the setup.
> 
> I dd'd the failing disk's entire contents (including partition table)
> to a new (8T) disk and inserted it in the array. The new configuration
> was recognized without problems. I ran check without mounting the file
> system. This completed (I failed to check dmesg to see how many
> inconsistencies it found). I mounted the file system and things seemed
> OK.
> 
> Next I did a diff with respect to a backup (unfortunately a close but
> not perfect backup). There were definitely some differences within
> some binary files.

If I'm not mistaken, regular RAID5 cannot protect against data corruption.
Parity only shows that a stripe is inconsistent, not which member is wrong, so
if one RAID member's content becomes corrupt (but still readable), restoring
the consistency of the affected stripe will likely damage the user data.
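
As a toy illustration of why parity alone cannot point at the bad member, here
is a minimal Python sketch of a single XOR-parity stripe. It is not md code:
the block contents, the helper names xor_blocks and rebuild_member, and the
idea that a repair simply recomputes parity from whatever the data blocks read
back as are all assumptions made for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, the way RAID5 parity is formed."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_member(members, parity, index):
    """Reconstruct one member from all the others plus parity (hypothetical helper)."""
    others = [m for i, m in enumerate(members) if i != index]
    return xor_blocks(others + [parity])


# One stripe on a 5-disk RAID5: four data blocks and one parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Silent corruption: member 2 returns wrong but perfectly readable content.
corrupted = list(data)
corrupted[2] = b"CxCC"

# A check can only tell that data and parity disagree.
mismatch = xor_blocks(corrupted) != parity

# Rebuilding any single member from the rest yields *a* consistent stripe,
# so the parity gives no hint which of these candidates is the right one.
candidates = [rebuild_member(corrupted, parity, i) for i in range(len(corrupted))]

# Assumed repair behaviour: recompute parity from the data as read. The stripe
# becomes consistent again, but the bad block in member 2 is now "blessed".
repaired_parity = xor_blocks(corrupted)

print("mismatch detected:", mismatch)
print("candidate rebuilds:", candidates)
```

Only the rebuild of member 2 gives back the original block, but nothing in the
stripe itself says so; that knowledge has to come from outside (SMART data, a
backup, or checksums).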

If you know one disk has corrupted content, you may be better off removing
that one from the array ASAP, and putting in a clean new disk, then rebuilding
onto that from the known-good other RAID members. (Of course then you take the
usual risk in any RAID5 rebuild, of another drive failing...)

Meanwhile RAID6, with its two independent parities, can supposedly do better
and detect which disk has the wrong content, but I remember reading something
to the effect that this math may or may not have been implemented in md RAID
yet.

To protect against data corruption you need RAID coupled with a checksumming
filesystem, such as Btrfs or ZFS. Note, however, that Btrfs RAID5/6 is not
considered mature and is not recommended for use.

> My question is "how to clean up this array?"
> 
> Should I try to delete the specific files I know have discrepancies
> and recopy them from the backup? Does that cure the mismatches in the
> space occupied by those files?

I would say yes, barring some corner case I'm missing. Writing new data writes
new, consistent stripe content (data and parity) for that data to all disks,
including the problematic one.
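
A rough sketch of how the recopy step could be scripted in Python, assuming
the backup and the array are both mounted as ordinary directory trees. The
function name restore_differing_files and both path arguments are
hypothetical. Since the backup is close but not perfect, a differing file may
also be one that legitimately changed after the backup, so the sketch defaults
to a dry run that only lists candidates for review.

```python
import filecmp
import shutil
from pathlib import Path


def restore_differing_files(array_root, backup_root, dry_run=True):
    """List (and optionally recopy) files whose content differs from the backup.

    Only files present on both sides are considered. Returns the relative
    paths that differ; with dry_run=False they are overwritten from backup.
    """
    array_root, backup_root = Path(array_root), Path(backup_root)
    differing = []
    for backup_file in sorted(backup_root.rglob("*")):
        if not backup_file.is_file():
            continue
        relative = backup_file.relative_to(backup_root)
        target = array_root / relative
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting size and modification time.
        if target.is_file() and not filecmp.cmp(backup_file, target, shallow=False):
            differing.append(str(relative))
            if not dry_run:
                shutil.copy2(backup_file, target)
    return differing
```

Usage would be something like restore_differing_files("/mnt/array",
"/mnt/backup") to review the list first (both mount points are placeholders),
then the same call with dry_run=False for the files you decide to restore.
Whether the rewritten data lands on the same blocks as before is up to the
filesystem.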

> What strategy should one take when it's clear that there's been a
> limited amount of bitrot?

If you do not use a checksumming filesystem, have a tool like "cfv" store
checksum files in each directory with rarely modified content (such as a media
library). If you had those prior to this incident, you could easily recheck
them all and tell which files need to be restored from backups or reobtained
elsewhere, and which ones you will have to accept with some rot in the content
(which may not be a fatal issue for video files, for example).
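
To show the idea, here is a minimal Python sketch of per-directory checksum
files. It is a stand-in for what a tool like cfv does, not cfv itself: the
manifest name SHA256SUMS, the "digest, two spaces, file name" line layout, and
the function names are assumptions chosen for the example.

```python
import hashlib
from pathlib import Path

MANIFEST = "SHA256SUMS"  # hypothetical manifest file name, one per directory


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(directory):
    """Record a checksum for every regular file directly inside `directory`."""
    directory = Path(directory)
    lines = []
    for entry in sorted(directory.iterdir()):
        if entry.is_file() and entry.name != MANIFEST:
            lines.append(f"{file_sha256(entry)}  {entry.name}\n")
    (directory / MANIFEST).write_text("".join(lines))


def verify_manifest(directory):
    """Return names of files that are missing or no longer match the manifest."""
    directory = Path(directory)
    damaged = []
    for line in (directory / MANIFEST).read_text().splitlines():
        expected, name = line.split("  ", 1)
        target = directory / name
        if not target.is_file() or file_sha256(target) != expected:
            damaged.append(name)
    return damaged
```

You would run write_manifest() once while the content is known to be good, and
verify_manifest() after an incident like this one; whatever it returns is the
list of files to restore or reobtain. The sketch does not recurse into
subdirectories and does not handle file names containing newlines.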

-- 
With respect,
Roman