Re: Suggestion needed for fixing RAID6

Neil Brown <neilb@xxxxxxx> · Mon, 3 May 2010 12:17:47 +1000

On Sat, 1 May 2010 23:44:04 +0200
"Janos Haar" <janos.haar@xxxxxxxxxxxx> wrote:

> The general problem is: I have one single-degraded RAID6 with 2 bad-block
> disks inside, which have bad blocks in different locations.
> The big question is how to keep the integrity, or how to do the rebuild in 2
> steps instead of one continuous run?

Once you have the fix that has already been discussed in this thread, the
only other problem I can see with this situation is if attempts to write good
data over the read errors result in a write error, which causes the device to
be evicted from the array.  And I think you have reported getting write
errors.

The following patch should address this issue for you.  It is *not* a
general-purpose fix, but a specific fix to address an issue you are having.
It might be appropriate to make this configurable via sysfs, or possibly even
to try to auto-detect the situation and skip the write.
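
Separately from the patch below, the "configurable" variant can be pictured
with a small user-space model.  This is only a sketch under assumptions: the
flag names, struct model_dev, and the trust_writeback switch are hypothetical
stand-ins, not md code and not an existing sysfs attribute.  It shows the two
behaviours for a block that had a read error and has since been recovered:
either schedule a write-back, or just clear the error and leave the device
alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags that loosely mirror the R5_* bits in the patch. */
enum {
	DEV_READ_ERROR = 1u << 0,	/* a read of this block failed */
	DEV_UPTODATE   = 1u << 1,	/* good data has been recovered */
	DEV_WANT_WRITE = 1u << 2	/* a write-back has been scheduled */
};

/* Hypothetical stand-in for the per-device state of one stripe. */
struct model_dev {
	unsigned int flags;
};

/*
 * Decide what to do with a recovered block.
 * Returns true if a write-back was scheduled, false otherwise.
 * trust_writeback is an assumed policy switch (imagine a sysfs knob).
 */
static bool handle_recovered_block(struct model_dev *dev, bool trust_writeback)
{
	assert(dev != NULL);

	/* Only act on blocks that failed a read and now hold good data. */
	if (!(dev->flags & DEV_READ_ERROR) || !(dev->flags & DEV_UPTODATE))
		return false;

	if (!trust_writeback) {
		/* Same idea as the "#if 1" branch: forget the error, no write. */
		dev->flags &= ~DEV_READ_ERROR;
		return false;
	}

	/* Default behaviour: try to rewrite the bad block with good data. */
	dev->flags |= DEV_WANT_WRITE;
	return true;
}
```

With trust_writeback false the device never sees the write, so a write error
cannot evict it; the cost is that the bad sector stays bad on that disk.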

Longer term I want to add support for storing a bad-block-list per device
so that a write error just fails that block, not the whole device.  I just
need to organise my time so that I make progress on that project.
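
To make the bad-block-list idea concrete, here is a minimal user-space
sketch.  It is an illustration only, not a design for md: struct
bad_block_list, BB_MAX, bb_is_bad() and bb_record() are hypothetical names,
and the fixed-size unsorted array is chosen purely for brevity.  The point it
shows is that a write error is recorded against one sector, and the whole
device only has to be failed when the list cannot hold any more entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BB_MAX 8	/* arbitrary capacity, for illustration only */

/* Hypothetical per-device list of sectors known to be bad. */
struct bad_block_list {
	uint64_t sector[BB_MAX];
	size_t count;
};

/* Return true if the sector is already recorded as bad. */
static bool bb_is_bad(const struct bad_block_list *bb, uint64_t s)
{
	assert(bb != NULL);
	for (size_t i = 0; i < bb->count; i++)
		if (bb->sector[i] == s)
			return true;
	return false;
}

/*
 * Record a write error at sector s.
 * Returns true if the sector is now in the list (device stays in the array),
 * false if the list is full, in which case the caller would have to fall
 * back to failing the whole device.
 */
static bool bb_record(struct bad_block_list *bb, uint64_t s)
{
	assert(bb != NULL);
	if (bb_is_bad(bb, s))
		return true;		/* already known, nothing to add */
	if (bb->count == BB_MAX)
		return false;		/* no room left */
	bb->sector[bb->count++] = s;
	return true;
}
```

A reader of such a list would treat a listed sector like a missing block and
reconstruct it from the other devices, rather than trusting the disk.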

NeilBrown

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c181438..fd73929 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3427,6 +3427,12 @@ static void handle_stripe6(struct stripe_head *sh)
 			    && !test_bit(R5_LOCKED, &dev->flags)
 			    && test_bit(R5_UPTODATE, &dev->flags)
 				) {
+#if 1
+				/* We have recovered the data, but don't
+				 * trust the device enough to write back
+				 */
+				clear_bit(R5_ReadError, &dev->flags);
+#else
 				if (!test_bit(R5_ReWrite, &dev->flags)) {
 					set_bit(R5_Wantwrite, &dev->flags);
 					set_bit(R5_ReWrite, &dev->flags);
@@ -3438,6 +3444,7 @@ static void handle_stripe6(struct stripe_head *sh)
 					set_bit(R5_LOCKED, &dev->flags);
 					s.locked++;
 				}
+#endif
 			}
 		}
 