Re: Filesystem corruption on RAID1

On Sun, Aug 20, 2017 at 1:14 AM, Mikael Abrahamsson <swmike@xxxxxxxxx> wrote:

> After a non-clean poweroff and possible mismatch now between the RAID1
> drives, and now fsck runs. It reads from the drives and fixes problem.
> However because the RAID1 drives contain different information, some of the
> errors are not fixed. Next time anything comes along, it might read from a
> different drive than what fsck read from, and now we have corruption.

fsck has no idea that this is two drives; it thinks it's one, and
overwrites whatever (virtual) blocks contain file system metadata
needing repair. md should then take each fsck write, duplicate it
(for a 2-way mirror), and push those writes to each physical device.

Since md satisfies each read from only one mirror rather than
comparing both, it's possible that a read comes from the non-corrupt
drive and presents good information to fsck, which then sees no reason
to fix anything in that block. The other mirror still has the
corruption, which goes undetected.
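
To make that failure mode concrete, here is a toy model in Python. It
is only a sketch of the behavior described above, not how md is
implemented; ToyMirror and toy_fsck are hypothetical names, and "ok" /
"garbage" stand in for real block contents.

```python
# Toy model of a 2-way mirror; an illustration, not md's implementation.
class ToyMirror:
    def __init__(self, nblocks):
        # Two independent copies of the same virtual block range.
        self.legs = [["ok"] * nblocks, ["ok"] * nblocks]

    def read(self, block, leg=0):
        # A read is satisfied from a single leg; the other is not consulted.
        return self.legs[leg][block]

    def write(self, block, data):
        # A write is duplicated to every leg.
        for leg in self.legs:
            leg[block] = data


def toy_fsck(mirror, read_leg):
    """Rewrite every block that looks bad via read_leg; return repaired blocks."""
    repaired = []
    for block in range(len(mirror.legs[0])):
        if mirror.read(block, read_leg) != "ok":
            mirror.write(block, "ok")
            repaired.append(block)
    return repaired


m = ToyMirror(4)
m.legs[1][2] = "garbage"          # unclean shutdown left leg 1 bad at block 2
missed = toy_fsck(m, read_leg=0)  # fsck happens to read the good leg
still_bad = m.read(2, leg=1)      # a later read lands on the other leg
```

In this model the first pass repairs nothing because it never sees
the bad copy, and the later read from the other leg still returns the
bad block. Only when the check happens to read the bad leg does the
repair write get duplicated to both copies.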

One way of dealing with it is to scrub (repair) the array first, so
that both mirrors have the same information to hand over to fsck. Any
fixups fsck makes then get replicated to both disks by md.
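
A scrub can be requested through md's sysfs files. The sketch below
assumes the array's md directory is something like /sys/block/md0/md
(the array name md0 is an assumption) and uses the sync_action and
mismatch_cnt files found there; the function names are made up for
illustration. Note that a repair only makes the copies agree; md has
no way of knowing which copy was the correct one.

```python
from pathlib import Path


def start_repair(md_dir):
    """Ask md to rewrite mismatched regions so that all mirrors agree."""
    # Writing "repair" to sync_action starts a repair pass on the array.
    Path(md_dir, "sync_action").write_text("repair\n")


def scrub_state(md_dir):
    """Return (current sync action, mismatch count) for the array."""
    action = Path(md_dir, "sync_action").read_text().strip()
    mismatches = int(Path(md_dir, "mismatch_cnt").read_text().strip())
    return action, mismatches
```

Usage would be along the lines of start_repair("/sys/block/md0/md")
as root, then polling scrub_state() until the action returns to
"idle" before running fsck.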

Another way is to split the mirror (mark one device faulty), and then
fix the file system on the remaining drive (the array is now
degraded). If that goes well, the 2nd device can be re-added. One
caveat, though: how it resyncs will depend on whether a write-intent
bitmap is present. I have no idea if write-intent bitmaps on two
drives can get out of sync and what the ensuing behavior is, but I'd
like to think md will discover that the fixed drive's event count is
higher than the re-added one's and, if necessary, do a full resync
rather than possibly re-introducing any corruption.
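
The split-and-repair sequence might look roughly like the following.
This is a sketch that only builds the command lines and runs nothing;
the device names in the usage lines and the function name are
assumptions. The full_resync option is one way to sidestep the bitmap
question: zeroing the md superblock on the dropped device and using
--add makes md treat it as a new member and rebuild it completely.

```python
def split_fix_readd(array, drop, full_resync=False):
    """Return the command sequence as argv lists; nothing is executed."""
    steps = [
        ["mdadm", array, "--fail", drop],    # mark one mirror faulty
        ["mdadm", array, "--remove", drop],  # detach it from the array
        ["fsck", array],                     # repair the degraded array
    ]
    if full_resync:
        # Wipe md metadata so the device comes back as a brand-new member.
        steps.append(["mdadm", "--zero-superblock", drop])
        steps.append(["mdadm", array, "--add", drop])
    else:
        # Re-add as a former member; resync may be limited by the bitmap.
        steps.append(["mdadm", array, "--re-add", drop])
    return steps


plan = split_fix_readd("/dev/md0", "/dev/sdb1")
full = split_fix_readd("/dev/md0", "/dev/sdb1", full_resync=True)
```

Printing the plan and reviewing it before running each step by hand
is probably safer than wiring this straight into subprocess.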



-- 
Chris Murphy