Re: RAID-6 and write hole with write-intent bitmap

On 24 Nov 2020, Mukund Sivaraman told this:
[...]
> (a) With RAID-5, assuming there are 4 member disks A, B, C, D, a write
> operation with its data on disk A and stripe's parity on disk B may
> involve:
>
> 1. a read of the stripe
> 2. update of data on A
> 3. computation and update of parity A^C^D on B
>
> These are not atomic updates. If power is lost between steps 2 and 3,

The writes usually proceed in parallel (because anything else would be
abominably slow). But... the problem is that the writes to the component
disks are also not atomic, and will likely not proceed at the same
rates: only with spindle-synched drives is there anything like a
guarantee of that, and those have been unobtainable for decades. So a
power loss could well leave 500 sectors of the stripe written on disk
A and 430 sectors written on disk B... and the sectors between 430
and 500 are not consistent. (Disk C might well be up around sector 600
and disk D around sector 450, and there's no *way* mere parity or RAID 6
syndromes can recover from the wildly-varying mess between sectors 430
and 600... nor is it recorded anywhere how far each disk's write had
got before the power went out. But the journal avoids this in the
usual fashion for a journal: the whole write is written out and
committed to stable storage first, so that on restart the incomplete
writes can just be replayed.)
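
To make the quoted RAID-5 case concrete, here is a rough Python sketch of
a single stripe with data on A, C, D and parity on B. It is only a toy
model: the chunk contents, the parity() helper and the dict standing in
for a journal are all made up for illustration and have nothing to do
with md's on-disk formats. It also works at whole-chunk granularity, so
it shows the simpler "A updated, B not" case rather than the
partially-written-sectors mess described above.

```python
from functools import reduce

def parity(chunks):
    """XOR equal-length byte chunks together (RAID-5 style parity)."""
    return bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*chunks))

# Hypothetical stripe: data on A, C, D; parity on B (as in the quoted text).
a_old, c, d = b"AAAA", b"CCCC", b"DDDD"
b_old = parity([a_old, c, d])

# New data reaches A, then power is lost before B's parity is updated.
a_new = b"aaaa"
after_crash = {"A": a_new, "B": b_old, "C": c, "D": d}

# The parity on B no longer matches the data: the stripe is inconsistent.
consistent = parity([after_crash[k] for k in "ACD"]) == after_crash["B"]

# If disk C now fails, rebuilding it from A, B and D gives the wrong data,
# even though C was never touched by the interrupted write.
c_rebuilt = parity([after_crash[k] for k in "ABD"])

# Toy journal: the complete set of new blocks was committed to stable
# storage before any member disk was written, so it can simply be replayed.
journal = {"A": a_new, "B": parity([a_new, c, d])}
replayed = {**after_crash, **journal}
consistent_after_replay = parity([replayed[k] for k in "ACD"]) == replayed["B"]
c_rebuilt_after_replay = parity([replayed[k] for k in "ABD"])
```

After the simulated crash, consistent is False and c_rebuilt differs from
c; after replaying the toy journal the parity matches again and C can be
rebuilt correctly.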

-- 
NULL && (void)


