Re: Raid1 resync problem with leap seconds ?

On Mon, 9 Jul 2012 11:09:56 +1000 NeilBrown <neilb@xxxxxxx> wrote:

> On Fri, 06 Jul 2012 14:33:47 +0200 Arnold Schulz <arnysch@xxxxxxx> wrote:
> 
> > Hi all,
> > 
> > about 8 seconds after inserting the leap second, a running raid1
> > resync crashed.
> 
> Thanks for the report.
> 
> I think you mean "8 minutes" (though it was really 7 minutes and 12 seconds).
> 
> Also it was a 'data-check' rather than a 'resync' :-)
> 
> It is extremely unlikely that the two are related.
> 
> There appears to be a use-after-free bug in the data-check code which you
> have managed to hit.  It has been there since 2006 (2.6.16) when data-check was
> added to raid1, and you are the first known victim.  Well done!
> 
> I'll submit a patch shortly.
> 

Below is that patch I'll be submitting, once it has been in -next for a day
or two.

Thanks,
NeilBrown


From 2d4f4f3384d4ef4f7c571448e803a1ce721113d5 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@xxxxxxx>
Date: Mon, 9 Jul 2012 11:34:13 +1000
Subject: [PATCH] md/raid1: fix use-after-free bug in RAID1 data-check code.

This bug has been present ever since data-check was introduced
in 2.6.16.  However it would only fire if a data-check were
done on a degraded array, which was only possible if the array
had 3 or more devices.  This is certainly possible, but is quite
uncommon.

Since hot-replace was added in 3.3 it can happen more often as
the same condition can arise if not all possible replacements are
present.

The problem is that as soon as we submit the last read request, the
'r1_bio' structure could be freed at any time, so we really should
stop looking at it.  If the last device is a read target, the loop
ends right after that final submission and no harm is done.  However
if the last device is not due to be read from, the loop keeps going
and still examines the bio pointers in the r1_bio, which might
already have been freed.

So use the read_targets counter to make sure we stop looking for bios
to submit as soon as we have submitted them all.

This fix is suitable for any -stable kernel since 2.6.16.

Cc: stable@xxxxxxxxxxxxxxx
Reported-by: Arnold Schulz <arnysch@xxxxxxx>
Signed-off-by: NeilBrown <neilb@xxxxxxx>

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 8c2754f..240ff31 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2485,9 +2485,10 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp
 	 */
 	if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
 		atomic_set(&r1_bio->remaining, read_targets);
-		for (i = 0; i < conf->raid_disks * 2; i++) {
+		for (i = 0; i < conf->raid_disks * 2 && read_targets; i++) {
 			bio = r1_bio->bios[i];
 			if (bio->bi_end_io == end_sync_read) {
+				read_targets--;
 				md_sync_acct(bio->bi_bdev, nr_sectors);
 				generic_make_request(bio);
 			}
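
For anyone who wants to poke at the failure mode outside the kernel, here is a
small standalone C model of the patched loop.  The names (fake_r1bio,
submit_read, SLOTS) are made up for illustration and the "completion" is
synchronous, so this is only a sketch of the logic, not the raid1.c code:

/* Model: "submitting" the last read may free the parent structure, so the
 * loop must stop dereferencing it once every read has been handed off,
 * which is exactly what the read_targets guard above achieves. */
#include <stdio.h>
#include <stdlib.h>

#define SLOTS 4                    /* stands in for conf->raid_disks * 2 */

struct fake_r1bio {
	int remaining;             /* like r1_bio->remaining */
	int is_read[SLOTS];        /* 1 where bi_end_io == end_sync_read */
};

/* Completion of the last outstanding read drops the final reference. */
static void submit_read(struct fake_r1bio *r1_bio)
{
	if (--r1_bio->remaining == 0) {
		printf("last read completed: freeing r1_bio\n");
		free(r1_bio);      /* r1_bio must not be touched after this */
	}
}

int main(void)
{
	struct fake_r1bio *r1_bio = calloc(1, sizeof(*r1_bio));
	int read_targets = 0, i;

	/* Degraded layout: the last slot is NOT a read target, which is
	 * the case the unpatched loop mishandled. */
	r1_bio->is_read[0] = r1_bio->is_read[1] = r1_bio->is_read[2] = 1;
	for (i = 0; i < SLOTS; i++)
		read_targets += r1_bio->is_read[i];
	r1_bio->remaining = read_targets;

	/* Patched loop: the "&& read_targets" test ends the walk as soon as
	 * all reads are submitted, so the (possibly freed) r1_bio is never
	 * dereferenced on the remaining iterations. */
	for (i = 0; i < SLOTS && read_targets; i++) {
		if (r1_bio->is_read[i]) {
			read_targets--;
			submit_read(r1_bio);
		}
	}
	return 0;
}

Drop the "&& read_targets" guard and the next iteration reads
r1_bio->is_read[3] after the free; that is the use-after-free the patch
closes.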


