Re: raid1 out of sync, but which files are affected?

On 12/02/2019 20:11, Nix wrote:
On 10 Feb 2019, Harald Dunkel spake thusly:

On 1/27/19 12:21 AM, Nik.Brt. wrote:

These mismatches do happen in raid1, but why they happen is not precisely known. There are a few theories, and they are said to be harmless in most cases (i.e. when they fall outside of files).
The phenomenon happens far less often if you have LVM on top of the raid1, and it is not known exactly why either.

This is more than alarming. Do I put my data at risk using software RAID1?

No, because the only situation in which they are known to happen is when
you have a powerdown or crash or similar event when the data has hit one
spindle and not the other. In this case, *either* content is valid: if
you get one, you could have got the other if the machine powered down a
fraction of a second earlier or later. All that matters is that the data
remains consistent.

No, this is a wrong interpretation. RAID should protect against that.
The mismatches you mention should go away shortly after rebooting, because the RAID resync logic takes care of them.
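
To see whether mismatches are still being counted after a reboot, the md sysfs attributes can be read directly. Below is a minimal sketch, assuming the array is called md0 (a placeholder name) and that the kernel exposes `sync_action` and `mismatch_cnt` under `/sys/block/<array>/md/`. The helper name `md_scrub_status` and its `sysfs_root` parameter are made up for illustration. Note that `mismatch_cnt` is only refreshed by a scrub ("check" or "repair"), so it reflects the last scrub rather than the live state.

```python
from pathlib import Path


def md_scrub_status(array="md0", sysfs_root="/sys/block"):
    """Return the current sync action and the mismatch count of an md array.

    `array` and `sysfs_root` are illustrative defaults; adjust them to
    the array actually in use.
    """
    md_dir = Path(sysfs_root) / array / "md"
    return {
        # "idle", "check", "repair", "resync", "recover", ...
        "sync_action": (md_dir / "sync_action").read_text().strip(),
        # count reported by the most recent check/repair pass
        "mismatch_cnt": int((md_dir / "mismatch_cnt").read_text().strip()),
    }
```

Calling `md_scrub_status("md0")` on a machine with such an array returns a small dict. A non-zero `mismatch_cnt` while `sync_action` is `idle` would suggest that the last scrub found differences which have not been repaired.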

After reboot, if there is no bitmap, one disk (raid1) is taken as good and is fully replicated onto the other one; the second disk is not read until the replication has completed. On raid 5/6 the data is taken as good and the parities are recomputed.
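
As a toy model of the two recovery strategies just described (not the actual md code): disks are plain Python lists of blocks, `resync_raid1` and `recompute_parity` are names invented for this sketch, and the parity shown is the simple XOR used by raid5 (raid6 adds a second, different syndrome that is not modelled here).

```python
from functools import reduce


def resync_raid1(good, stale):
    """Overwrite every block of the stale mirror with the disk taken as good."""
    stale[:] = good
    return stale


def recompute_parity(data_blocks):
    """raid5-style parity: bytewise XOR of the data blocks of one stripe."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_blocks))


# raid1: the last write reached disk_a but not disk_b before the crash.
disk_a = [b"AAAA", b"BBBB", b"new!"]
disk_b = [b"AAAA", b"BBBB", b"old."]
resync_raid1(disk_a, disk_b)  # disk_b now matches disk_a block for block

# raid5: whatever is in the data blocks is trusted, parity is rebuilt from it.
stripe = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = recompute_parity(stripe)
```

Either way the array ends up self-consistent: which version of the last write survives depends on what reached the trusted disk, but both mirrors (or data and parity) agree afterwards.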

If there is a bitmap, only the regions that are marked dirty are resynced. By using flush, the bitmap is always guaranteed to be the first thing to be set dirty and the last thing to be set clean. That of course requires that the disks implement the flush command correctly.
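
The ordering can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: `ToyMirror` is a made-up class, the bitmap is an in-memory list that is assumed to be durable the moment it is set (which is what the flush is for), and a crash is modelled by stopping the write loop early.

```python
class ToyMirror:
    """Two-disk mirror with a write-intent bitmap (illustrative model only)."""

    def __init__(self, regions):
        self.disks = [[0] * regions, [0] * regions]
        self.bitmap = [False] * regions  # assumed durable once set (flush)
        self.resynced = []

    def write(self, region, value, crash_after=None):
        self.bitmap[region] = True       # 1. set dirty first, then flush
        for i, disk in enumerate(self.disks):
            if crash_after is not None and i >= crash_after:
                return False             # power lost: the bit stays dirty
            disk[region] = value         # 2. write the data to each mirror
        self.bitmap[region] = False      # 3. set clean last
        return True

    def recover(self):
        """After reboot, resync only the regions still marked dirty."""
        for region, dirty in enumerate(self.bitmap):
            if dirty:
                self.disks[1][region] = self.disks[0][region]
                self.bitmap[region] = False
                self.resynced.append(region)
        return self.resynced


m = ToyMirror(4)
m.write(0, 7)                 # completes normally, region 0 is clean again
m.write(2, 9, crash_after=1)  # crash after the first mirror was written
m.recover()                   # only region 2 gets copied
```

Because the dirty bit is set before any data is written and cleared only after both mirrors have it, a crash at any point in between leaves the region flagged, so recovery never misses a region where the mirrors may differ.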

RAID is not meant to protect against data loss on sudden powerdown
(that's the job of UPSes and filesystem journals) nor really against
data loss on single-sector disk damage or intermittent connectivity
problems. It's meant to protect against data loss on whole-disk failure.
That's all. If it protects against other things, good, but other
scenarios are not within the design intent of RAID.

RAID would be very weak if it were that vulnerable.

RAID does protect against single-sector damage IF the disk reports a read error. Cabling errors and other logic errors, which don't result in CRC errors on the disk surface, are nasty in this sense. Cabling errors should hopefully be reported as CRC errors on the disk side, visible in dmesg and in SMART.




