Re: ALERT: md/raid6 data corruption risk.

On Sun, Aug 17, 2014 at 11:16 PM, NeilBrown <neilb@xxxxxxx> wrote:
>
> Hi all,
>  There is a risk of data loss with md/raid6 arrays running on Linux since
>  2.6.32.
>  If:
>    - the array is doubly degraded
>    - one or both failed devices are being recovered, and
>    - the array is written to
>
>  then it is possible for data on the array to be lost.  The patch below fixes
>  the problem.  If you apply the patch to an older kernel which has separate
>  handle_stripe5() and handle_stripe6() functions, be sure that the patch changes
>  handle_stripe6().
>
>  There is no risk to an optimal array or a singly-degraded array.  There is
>  also no risk on a doubly-degraded array which is not recovering a device or
>  is not receiving write requests.
>
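
For anyone wanting to check whether an array currently matches the first two
conditions above, here is a minimal sketch. It assumes the md sysfs attributes
"level", "degraded" and "sync_action" under /sys/block/<dev>/md/; the function
names are hypothetical and this is not part of the patch. The third condition
(writes arriving) cannot be read from this state, so a True result only means
that a write could hit the bug.

```python
from pathlib import Path


def raid6_at_risk(level: str, degraded: int, sync_action: str) -> bool:
    """Return True for a RAID6 array missing two devices while recovering.

    Mirrors the conditions in the alert: doubly degraded plus an active
    recovery. Whether the array is also being written to is not checked.
    """
    return level == "raid6" and degraded >= 2 and sync_action == "recover"


def read_md_state(md_dir):
    """Read (level, degraded, sync_action) from an md sysfs directory.

    md_dir is assumed to look like /sys/block/md0/md.
    """
    md = Path(md_dir)

    def read(name):
        return (md / name).read_text().strip()

    return read("level"), int(read("degraded")), read("sync_action")


if __name__ == "__main__":
    # Scan every md array that exposes the assumed sysfs layout.
    for md_dir in sorted(Path("/sys/block").glob("md*/md")):
        try:
            state = read_md_state(md_dir)
        except (OSError, ValueError):
            continue  # attribute missing or unreadable; skip this array
        print(md_dir.parent.name, state, "AT RISK" if raid6_at_risk(*state) else "ok")
```

A singly-degraded array, or a doubly-degraded one whose sync_action is "idle",
comes back False, matching the "no risk" cases listed above.
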
>  If you have data on a RAID6 array, please consider how to avoid corruption,
>  possibly by applying the patch, possibly by removing any hot spares so
>  recovery does not automatically start.
>
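
To see which arrays have hot spares attached (and so could start recovery on
their own), one option is to look for the "(S)" marker in /proc/mdstat. The
sketch below only reports spares; the helper name and the sample text are
illustrative assumptions, and it assumes the usual "mdX : active raid6
dev[N](S) ..." line layout. Note that members of an inactive array are also
shown with "(S)".

```python
import re

# A member listed as name[slot](S) is treated as a spare.
SPARE_RE = re.compile(r"(\S+)\[\d+\]\(S\)")
ARRAY_RE = re.compile(r"^(md\S*)\s*:")


def find_spares(mdstat_text):
    """Map each array name to the list of its members marked (S)."""
    spares = {}
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if not array:
            continue  # not an "mdX : ..." line
        names = SPARE_RE.findall(line)
        if names:
            spares[array.group(1)] = names
    return spares


if __name__ == "__main__":
    with open("/proc/mdstat") as f:
        for array, names in find_spares(f.read()).items():
            print(array, "has spares:", ", ".join(names))
```

Removing a reported spare is left to the usual mdadm tooling; the sketch makes
no change to any array.
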
>  This patch will be sent upstream shortly and will subsequently appear in
>  future "-stable" kernels.
>
> NeilBrown
>
> From f94e37dce722ec7b6666fd04be357f422daa02b5 Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@xxxxxxx>
> Date: Wed, 13 Aug 2014 09:57:07 +1000
> Subject: [PATCH] md/raid6: avoid data corruption during recovery of
>  double-degraded RAID6
>
> During recovery of a double-degraded RAID6 it is possible for
> some blocks not to be recovered properly, leading to corruption.
>
> If a write happens to one block in a stripe that would be written to a
> missing device, and at the same time that stripe is recovering data
> to the other missing device, then that recovered data may not be written.
>
> This patch skips, in the double-degraded case, an optimisation that is
> only safe for single-degraded arrays.
>
> The bug was introduced in 2.6.32 and the fix is suitable for any kernel
> since then.  In an older kernel with separate handle_stripe5() and
> handle_stripe6() functions, the patch must change handle_stripe6().
>
> Cc: stable@xxxxxxxxxxxxxxx (2.6.32+)
> Fixes: 6c0069c0ae9659e3a91b68eaed06a5c6c37f45c8
> Cc: Yuri Tikhonov <yur@xxxxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Reported-by: "Manibalan P" <pmanibalan@xxxxxxxxxxxxxx>
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1090423
> Signed-off-by: NeilBrown <neilb@xxxxxxx>
>

Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>

...with a capital "ACK"!